Preface


Since the seminal mean-variance analysis was introduced by Markowitz (1952),
portfolio management theory has been expanded to take account of different
features:

• Dynamic portfolio optimization as per Merton (1962);

• Choice of new decision criteria, based on risk aversion (utility functions)
or risk measures (VaR, CVaR and beyond);

• Market imperfections, e.g., transaction costs; and,

• Specific portfolio strategies, such as portfolio insurance or alternative
methods (hedge funds).

At the same time, many new financial products have been introduced, based
in particular on financial derivatives.

Due to this intensive development and increasing complexity, this book has
four purposes:

• First, to recall standard results and to provide new insights about the
axiomatics of individual choice in an uncertain framework. The book gives a
concise introduction to portfolio choice under uncertainty based on
investors' preferences (usually represented by utility functions) and on
several kinds of risk measures. These theories are the fundamental basis of
portfolio optimization.

- Chapter 1 recalls the seminal approach of utility maximization,
introduced by von Neumann and Morgenstern. It also deals with
further extensions of this theory, such as weighted expected utility
theory, non-expected utility theory, etc.

- Chapter 2 contains a survey of a newer approach: risk measure
minimization. Such risk measures have been introduced recently,
in particular to take better account of nonsymmetric asset return
distributions.

• Second, to provide a precise overview of standard portfolio optimization.
Both passive and active portfolio management are considered.
Other results, such as risk measure minimization, are more recent.

- Chapter 3 is devoted to the very well-known Markowitz analysis.
Some extensions are analyzed, in particular with risk minimization
constraints such as safety criteria.

- Chapter 4 deals with two important standard forms of fund management:
managing indexed funds and benchmarked portfolio optimization.
In particular, statistical methods to replicate a financial index are
detailed and discussed. As regards benchmarking, the tracking error
is computed and analyzed.


- Chapter 5 recalls results about the main performance measures, such 
as the Sharpe and Treynor ratios and the Jensen alpha. 


• Third, to make the literature about stochastic optimization applied to
mathematical finance (see for example Part III) accessible to students,
to researchers who are not specialists on this subject, and to financial
engineers. In particular, a review of the main standard results, for both
static and dynamic cases, is provided. For this purpose, precise
mathematical statements are detailed without “too many” technicalities. In
particular:


- Chapter 6 provides an introduction to dynamic portfolio optimiza- 
tion. The two main methods are the theory of stochastic control 
based on dynamic programming principle and, more recently, the 
martingale approach jointly used with convex duality. 


- Chapter 7 gives two important applications of previous results: the 
search for an optimal portfolio profile and the long-term manage- 
ment. 


- Chapter 8 is the most “technical” one. It provides an overview of
portfolio optimization with market frictions, such as incompleteness,
transaction costs, labor income, random time horizon, etc.

• Finally, to show how theoretical results can be applied to practical and
operational portfolio optimization (Part IV). This last part of the book
deals with structured portfolio management, which has grown significantly
in the past few years.




- Chapter 9 is devoted to portfolio insurance and, in particular, to 
OBPI and CPPI strategies. 


- Chapter 10 shows how common strategies, used by practitioners, may 
be justified by utility maximization under, for example, guarantee 
constraints. It summarizes the main results concerning optimal 
portfolios when risk measures such as expected shortfall are intro- 
duced to limit downside risk. 


- Chapter 11 recalls some problems when dealing with hedge funds, in 
particular the choice of appropriate performance measures. 

As a by-product, special emphasis is put on:

• Utility theory versus practice;

• Active versus passive management; and,

• Static versus dynamic portfolio management.


I hope this book will contribute to a better understanding of the modern 
portfolio theory, both for students and researchers in quantitative finance. 


I am grateful to the CRC editorial staff for encouraging this project, in
particular Sunil Nair, and for the help during the preparation of the final
version: Michele Dimont and Shashi Kumar.


Jean-Luc PRIGENT, PARIS, February 2007.

Part I
Utility and risk analysis 


“[Under uncertainty] there is no scientific basis on which to form any
calculable probability whatever. We simply do not know. Nevertheless, the
necessity for action and for decision compels us as practical men to do our 
best to overlook this awkward fact and to behave exactly as we should if 
we had behind us a good Benthamite calculation of a series of prospective 
advantages and disadvantages, each multiplied by its appropriate probability 
waiting to be summed.” 


John Maynard Keynes, “General Theory of Employment,” Quarterly Journal 
of Economics, (1937). 


2 Portfolio Optimization and Performance Analysis 


Nowadays, financial theory is one of the major economic fields where decision- 
making under uncertainty plays a crucial part. Actually, many sources of risk 
(market, model, liquidity, operational, etc.) have to be taken into account and 
carefully examined for most financial activities, such as pricing and hedging 
derivatives, asset allocation, or credit portfolio management. 


Assume that these risky events are identified with, for example, probability 
distributions that may be objective or subjective. Nevertheless: 


How can we model individual decisions under uncertainty? 


Is it possible to rationalize traders' or portfolio managers' strategies? Can
we provide them with sufficiently operational and computational tools to try
to improve their decision process?


As is well known, a unified framework can be proposed to quantify
uncertainty in financial modelling: utility theory, and especially the expected
utility theory introduced by John von Neumann and Oskar Morgenstern in
[400] and recognized for its usefulness and applicability.


Utility functions are based on risk aversion modelling from which the notion 
of risk premium can be defined. In Chapter 1, basic notions of the theory of 
decision under uncertainty are recalled. The emphasis is put on the expected 
utility theory and various risk aversion notions. One of the advantages of the 
expected utility is that it provides an operational tool to determine explicit 
portfolios under mild assumptions. In this framework, the risk-aversion allows 
a calibration of the portfolio weights, as detailed in Part III. 


Nevertheless, the increasing development of so-called behavioral economics
and finance, based on empirical evidence, justifies sections devoted
to alternative preference representation theories. Indeed, many experimental
studies have shown that individuals (in particular investors) do not act
according to the expected utility theory. This can partly explain investment
anomalies such as insufficient diversification, financial bubbles, etc.


However, a new stream has emerged based on bank activity regulation. It
focuses in particular on potential losses and downside risk.


In [373] and [374], Markowitz proposed to measure the risk of portfolio returns
by means of their variances, which judiciously involve the joint distribution
of returns of all assets. Despite its simplicity and tractability, the Markowitz
model has two pitfalls:

• First, the probability distribution of each asset return is characterized
only by its first two moments. In the case of non-Gaussian distributions
(even symmetrical), the Markowitz model and utility theories are mainly
compatible for quadratic utility functions.

• Second, the dependence structure is only described by the linear
correlation coefficients of each pair of asset returns. As shown, for example,
by Alexander [14], the linear correlation coefficient is not always
applicable. It also may imply incorrect results when probability distributions
are not elliptic (see Joe [307]), as proved for instance by Embrechts et al.
([201] and [202]). In that case, severe losses can be observed if extreme
events are underestimated.


What kind of risk measures can be introduced? 


Unlike dispersion risk measures such as the standard deviation, other mea- 
sures have been proposed, based rather on downside risks. 


Starting with the seminal paper by Artzner, Delbaen, Eber and Heath [31], specific
axioms have been introduced to model risk measures (coherent measures), and further
examined and generalized by Föllmer and Schied [236] (convex measures).


Chapter 2 is devoted to the definitions and main properties of such risk
measures. Note that, as for preference representation, the theory of risk
measures is not yet complete, in particular when they have to be defined in a
dynamic framework. Besides, both approaches are linked, as shown by recent
results. Among the possible operational risk measures, the value-at-risk and
its “coherent” extension, the expected shortfall, have emerged as important
tools for bank regulation and risk management.


This is the reason why, in Chapter 2, some emphasis is put on these
measures, in particular on some results about their estimation and the
computation of their sensitivities. Under some additional and “rational” specific
axioms, risk measures can be defined from the expected shortfall (the so-called
spectral risk measures).


Portfolio management can also involve such measures to limit risk exposure, 
as detailed for instance in Part IV, Chapter 10. 


Chapter 1 


Utility theory 


The importance of risk and uncertainty in economic analysis was suggested for 
the first time by Frank H. Knight in his seminal treatise Risk, Uncertainty and 
Profit (see [328]). Previously, very few economists considered that risk and 
uncertainty might play a key role in economic theory, except for some notable 
examples like Carl Menger [384], Irving Fisher [227] and Francis Edgeworth 
[186]. The problem was: 


• First, to define precisely the notions of “uncertainty” or “risk” when
events are random; and,

• Second, to model the choice process within risk and uncertainty.


The notion of choice under risk and uncertainty was not well modelled for a
long time, despite the results of Hicks [293] and Marschak [377], who
understood that preferences should also be defined on distributions, to take account
simultaneously of the evaluation of the level of risk or uncertainty, and of the
pure preferences over outcomes. However, Bernoulli [54] had formerly
introduced the notion of expected utility to solve the famous St. Petersburg paradox
posed in 1713. The expected utility theory allows the representation of the
level of satisfaction by the sum of utilities from outcomes weighted by the
probabilities of these outcomes. Nevertheless, in that case, a gain can
increase utility less than a decline can reduce it, which was not considered
rational.


In the seminal Theory of Games and Economic Behavior, John von Neumann
and Oskar Morgenstern [400] succeeded in providing a rational foundation
for decision-making under risk according to expected utility properties.
At that time, this axiomatic foundation for decision-making under risk was
not well understood. This theory was further developed by Marschak [378],
Samuelson ([445] and [446]), Herstein and Milnor [291] and others.

The expected utility hypothesis was reinforced in the famous Foundations
of Statistics by Savage [449]. Savage proposed to deduce the expected utility
property without imposing prior objective probabilities but by determining
implicit subjective probabilities. This approach was further studied by
Anscombe and Aumann [26]. Thus the von Neumann-Morgenstern theory
was extended by the Savage-Anscombe-Aumann “subjective” approach.

The “state-preference” approach to uncertainty was introduced by Arrow
[28] and Debreu [152]. It does not necessarily assume the existence of
objective or subjective probabilities. Rather, it concerns actual goods rather than
money amounts and has been applied to study general economic equilibria.


The notion of “risk aversion” was introduced by Friedman and Savage [242]
and by Markowitz [374]. The measures of risk aversion were examined by
Pratt [412] and Arrow [29], and later studied by Ross [434]. The different
notions of risk aversion have been further developed by Yaari [506] and
Kihlstrom and Mirman [327]. The notions and definitions of “riskiness” based
on stochastic dominance were suggested by Rothschild and Stiglitz ([435] and
[436]) and Diamond and Stiglitz [167].


The expected utility assumption for modelling choice under risk and
uncertainty has been discussed and disputed, in particular by Allais [19] and
Ellsberg [196]. This debate has generated many alternative approaches to
expected utility theory. Some of them have been based on experiments,
such as in Kahneman and Tversky [314]. The main alternative models
are: weighted expected utility (Allais [20], Chew and McCrimmon [120]);
rank-dependent expected utility (Quiggin [420], Yaari ([506], [507]), and more
recently, the cumulative prospect theory by Kahneman and Tversky [496]);
non-linear expected utility (Machina [367]); and regret theory (Loomes and
Sugden [365]) to take account of preference reversals. Other alternatives to
expected utility theory have been developed within the framework of Savage’s
subjective probability: non-additive expected utility (Schmeidler [454]) and
state-dependent preferences (Karni [323]).


To summarize, the purpose of decision theory is to provide analytical
tools of different degrees of formality in order to model the behavior of a
decision maker who has to choose among a set of alternatives with different
consequences. Typically, since Knight [328], three cases are distinguished,
according to the degree of information:

• First, the environment is certain: the agent perfectly knows the event
that will occur in the future.

• Second, the environment is risky: this means that there exist
uncontrollable random events for which a probability space can be
modelled; in particular, a probability distribution can be determined.

• Finally, the environment is uncertain: in that case, the probability
distribution is unknown.


Financial theory is mainly concerned with the second situation. However,
for the third case, note that under some assumptions a subjective probability
may exist, as proved by Savage [449].

1.1 Preferences under uncertainty


In order to model any decision problem under risk, it is necessary to in- 
troduce a functional representation of preferences which measures the degree 
of satisfaction of the decision maker. This is the purpose of the utility the- 
ory. The investor is supposed to be “rational”: this means that his choices 
are made according to given “good” rules which are “stable” over time (in 
some sense). Thus a binary relation on possible outcomes can be proposed to 
analyze his behavior. Specific axioms are introduced to describe his “ratio- 
nality.” Then, for this given identified choice functional, his optimal decision 
(for example his investment strategy) is determined from the “maximization” 
of this criterion. 


1.1.1 Lotteries 


Within a risky framework, first we must pick out all possible outcomes 
which may have an impact on the consequences of the decisions. Secondly, 
we must associate each random event with a probability. 


Let $\Omega$ represent the set of possible outcomes. To simplify the exposition,
we suppose $\Omega$ is finite:
$$\Omega = \{\omega_1, \ldots, \omega_m\}.$$
Let $p = \{p_1, \ldots, p_m\}$ be the probabilities of occurrence of the outcomes of $\Omega$:
$$\forall i, \quad 0 \le p_i \le 1 \quad \text{and} \quad \sum_{i=1}^{m} p_i = 1.$$


DEFINITION 1.1 A lottery $L$ is defined by a vector $\{(\omega_1, p_1), \ldots, (\omega_m, p_m)\}$.
The set of all lotteries with the same given set of outcomes $\Omega$ is denoted by
$\mathcal{L}$. A compound lottery $L_c$ is a lottery whose outcomes are also lotteries.


Consider for example a compound lottery $L_c$ with two outcomes: the lottery
$L^a$ and the lottery $L^b$ with respective probabilities $\alpha$ and $1-\alpha$. Then the
probability that the outcome of $L_c$ will be $\omega_i$ is given by:
$$p_i = \alpha p_i^a + (1-\alpha) p_i^b.$$
Therefore, $L_c$ has the same vector of probabilities as the convex combination:
$$\alpha L^a + (1-\alpha) L^b.$$
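
The reduction of a compound lottery to a simple one can be illustrated numerically. The following is a minimal sketch, not taken from the text: the function name `reduce_compound` and the numerical probabilities are hypothetical, chosen only to show the convex combination at work.

```python
from fractions import Fraction

def reduce_compound(alpha, p_a, p_b):
    """Outcome probabilities of the compound lottery that plays
    lottery a with probability alpha and lottery b otherwise."""
    if len(p_a) != len(p_b):
        raise ValueError("both lotteries must share the same outcome set")
    return [alpha * pa + (1 - alpha) * pb for pa, pb in zip(p_a, p_b)]

# Hypothetical lotteries over three outcomes (illustrative numbers only).
p_a = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
p_b = [Fraction(0), Fraction(1, 4), Fraction(3, 4)]
alpha = Fraction(1, 3)

p_c = reduce_compound(alpha, p_a, p_b)
print(p_c)       # reduced probabilities of the compound lottery
print(sum(p_c))  # the reduced probabilities still sum to one
```

Exact fractions are used here so that the check that the probabilities sum to one is not blurred by rounding.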
1.1.2 Axioms on preferences


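The decision maker is assumed to be “rational” if his preference relation
(denoted by $\succeq$) over the set of lotteries $\mathcal{L}$ is a binary relation which satisfies
the following axioms:

Axiom 1:

• The relation $\succeq$ is complete (all lotteries are always comparable by $\succeq$):
$$\forall L^a \in \mathcal{L}, \forall L^b \in \mathcal{L}, \quad L^a \succeq L^b \text{ or } L^b \succeq L^a.$$
The indifference relation $\sim$ associated to the relation $\succeq$ is defined by:
$$\forall L^a \in \mathcal{L}, \forall L^b \in \mathcal{L}, \quad L^a \sim L^b \iff L^a \succeq L^b \text{ and } L^b \succeq L^a.$$

• The relation $\succeq$ is reflexive:
$$\forall L \in \mathcal{L}, \quad L \succeq L.$$

• The relation $\succeq$ is transitive:
$$\forall L^a \in \mathcal{L}, \forall L^b \in \mathcal{L}, \forall L^c \in \mathcal{L}, \quad L^a \succeq L^b \text{ and } L^b \succeq L^c \implies L^a \succeq L^c.$$

Another standard assumption is continuity: small changes in probabilities
do not modify the ordering between two lotteries. This property is specified
in the following axiom:

Axiom 2: The preference relation $\succeq$ on the set $\mathcal{L}$ of lotteries is such that
$\forall L^a \in \mathcal{L}, \forall L^b \in \mathcal{L}, \forall L^c \in \mathcal{L}$, if $L^a \succeq L^b \succeq L^c$ then there exists a scalar
$\alpha \in [0,1]$ such that:
$$L^b \sim \alpha L^a + (1-\alpha) L^c.$$
This continuity axiom implies the existence of a functional $U : \mathcal{L} \to \mathbb{R}$ such
that:
$$\forall L^a \in \mathcal{L}, \forall L^b \in \mathcal{L}, \quad L^a \succeq L^b \iff U(L^a) \ge U(L^b).$$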

The previous axioms were already well-known in the economic theory of 
consumer choice (sometimes called the “weak order” axioms). To develop the 
analysis of economics under uncertainty, more properties must be imposed 
on the preferences. One of the most important conditions that can be added 
to describe the behavior of the economic agent (here the “investor” ) is the 
independence axiom, which is the foundation of the standard theory under 
uncertainty, but considered more troublesome (stated as in Jensen [301]): 


Axiom 3: The preference relation $\succeq$ on the set $\mathcal{L}$ of lotteries is such that
$\forall L^a \in \mathcal{L}, \forall L^b \in \mathcal{L}, \forall L^c \in \mathcal{L}$ and for all $\alpha \in [0,1]$,
$$L^a \succeq L^b \iff \alpha L^a + (1-\alpha) L^c \succeq \alpha L^b + (1-\alpha) L^c.$$


This property means that if two lotteries $L^a$ and $L^b$ are mixed in the same
way with any third lottery $L^c$, then the preference ordering between the two
new mixed lotteries is not modified.
1.2 Expected utility


The independence axiom, implicitly introduced in von Neumann and Mor- 
genstern [400], characterizes the expected utility criterion: indeed, it implies 
that the preference functional U on the lotteries must be linear in the proba- 
bilities of the possible outcomes: 


THEOREM 1.1

Assume that the preference relation $\succeq$ on the set $\mathcal{L}$ of lotteries satisfies the
continuity and independence axioms. Then, the relation $\succeq$ can be represented
by a preference functional that is linear in probabilities: there exists a
function $u$ defined (up to a positive linear transformation) on the space of
possible outcomes $\Omega$ and with values in $\mathbb{R}$ such that for any two lotteries
$L^a = \{(p_1^a, \ldots, p_m^a)\}$ and $L^b = \{(p_1^b, \ldots, p_m^b)\}$, we have:
$$L^a \succeq L^b \iff \sum_{i=1}^{m} u(\omega_i) p_i^a \ge \sum_{i=1}^{m} u(\omega_i) p_i^b. \tag{1.1}$$
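
As an illustration of criterion (1.1), the sketch below ranks two lotteries by expected utility. It is only a numerical example under stated assumptions: the outcomes, the probabilities, and the choice of a logarithmic $u$ are hypothetical and do not come from the text. It also checks two properties stated in the theorem: linearity in probabilities, and invariance of the ranking under a positive linear transformation of $u$.

```python
import math

def expected_utility(u, outcomes, probs):
    """Sum of u(omega_i) * p_i over a finite outcome set."""
    return sum(u(w) * p for w, p in zip(outcomes, probs))

# Hypothetical monetary outcomes and two lotteries on them.
outcomes = [50.0, 100.0, 150.0]
p_a = [0.0, 1.0, 0.0]   # pays 100 for sure
p_b = [0.5, 0.0, 0.5]   # pays 50 or 150 with equal probability

u = math.log            # assumed utility, chosen for illustration
eu_a = expected_utility(u, outcomes, p_a)
eu_b = expected_utility(u, outcomes, p_b)
print(eu_a > eu_b)      # lottery a is ranked above lottery b

# Linearity in probabilities: utility of a mixture of a and b.
alpha = 0.3
p_mix = [alpha * pa + (1 - alpha) * pb for pa, pb in zip(p_a, p_b)]
eu_mix = expected_utility(u, outcomes, p_mix)

# A positive linear transformation of u leaves the ranking unchanged.
v = lambda w: 3.0 * math.log(w) + 2.0
ev_a = expected_utility(v, outcomes, p_a)
ev_b = expected_utility(v, outcomes, p_b)
print((ev_a > ev_b) == (eu_a > eu_b))
```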


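PROOF In what follows, only a sketch of the proof is presented. We
have mainly to prove that for any two lotteries $L^a$ and $L^b$, and any compound
lottery $L = \alpha L^a + (1-\alpha) L^b$:
$$U[\alpha L^a + (1-\alpha) L^b] = \alpha U[L^a] + (1-\alpha) U[L^b].$$

• First, consider the worst and best lotteries, $\underline{L}$ and $\overline{L}$ in $\mathcal{L}$, with
respect to the preference functional $U$. They are obtained by minimizing
and maximizing $U$ on a closed preference interval of $\mathcal{L}$. Then, for any
lottery $L$ in $\mathcal{L}$, we have: $\overline{L} \succeq L \succeq \underline{L}$. Thus, by the continuity
axiom, there exist two scalars $\beta^a$ and $\beta^b$ in $[0,1]$ such that:
$$L^a \sim \beta^a \overline{L} + (1-\beta^a)\underline{L} \quad \text{and} \quad L^b \sim \beta^b \overline{L} + (1-\beta^b)\underline{L}.$$
Note that $\beta^a$ and $\beta^b$ are unique.

• (Mixture monotonicity.) We have:
$$L^a \succ L^b \iff \beta^a > \beta^b.$$
Indeed, if $\beta^a > \beta^b$, then the parameter $\gamma = \frac{\beta^a - \beta^b}{1 - \beta^b}$ is in $[0,1]$. Then, we
have:
$$\beta^a \overline{L} + (1-\beta^a)\underline{L} \sim \gamma \overline{L} + (1-\gamma)\left[\beta^b \overline{L} + (1-\beta^b)\underline{L}\right].$$
Moreover, by definition of $\overline{L}$, $\overline{L} \succeq \beta^b \overline{L} + (1-\beta^b)\underline{L}$. Therefore,
using the independence axiom, we deduce:
$$\gamma \overline{L} + (1-\gamma)\left[\beta^b \overline{L} + (1-\beta^b)\underline{L}\right] \succeq \gamma\left[\beta^b \overline{L} + (1-\beta^b)\underline{L}\right] + (1-\gamma)\left[\beta^b \overline{L} + (1-\beta^b)\underline{L}\right] = \beta^b \overline{L} + (1-\beta^b)\underline{L}.$$
Consequently, $L^a \sim \beta^a \overline{L} + (1-\beta^a)\underline{L} \succeq L^b \sim \beta^b \overline{L} + (1-\beta^b)\underline{L}$.
This result is quite intuitive, since it means that if we construct two
compound lotteries $L^a$ and $L^b$ with different weights, then we prefer the
compound lottery in which the best lottery is given the higher weight.

• Consequently, we conclude that the functional $U$, such that for any
lottery $L \sim \beta \overline{L} + (1-\beta)\underline{L}$ we have $U(L) = \beta$, satisfies exactly
the definition (1.1.2) of a preference functional associated to the preference
relation $\succeq$.

• Finally, we must prove that:
$$U[\alpha L^a + (1-\alpha) L^b] = \alpha\beta^a + (1-\alpha)\beta^b.$$
This is equivalent to showing that:
$$\alpha L^a + (1-\alpha) L^b \sim (\alpha\beta^a + (1-\alpha)\beta^b)\overline{L} + (\alpha(1-\beta^a) + (1-\alpha)(1-\beta^b))\underline{L}.$$
Using the independence axiom twice, we get:
$$\begin{aligned}
\alpha L^a + (1-\alpha) L^b &\sim \alpha\left[\beta^a \overline{L} + (1-\beta^a)\underline{L}\right] + (1-\alpha) L^b \\
&\sim \alpha\left[\beta^a \overline{L} + (1-\beta^a)\underline{L}\right] + (1-\alpha)\left[\beta^b \overline{L} + (1-\beta^b)\underline{L}\right] \\
&\sim (\alpha\beta^a + (1-\alpha)\beta^b)\overline{L} + (\alpha(1-\beta^a) + (1-\alpha)(1-\beta^b))\underline{L}.
\end{aligned}$$

Then, the previous result is extended to the whole space of lotteries $\mathcal{L}$
and the proof is finished. □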


REMARK 1.1
1) The property in Theorem 1.1 is an equivalence, i.e., expected utility
implies the three axioms.

2) Note that the previous results have been proved for simple lotteries, i.e.,
probability distributions which take positive values only for a finite number
of outcomes. We have to extend these results to continuous spaces (“infinite
support”): i.e., for any probability measure $P$ on $\Omega$, we must prove an analogue
to the expected utility decomposition
$$U(P) = \int_{\Omega} u(\omega)\, dP(\omega). \tag{1.2}$$
For this purpose, the continuity axiom (or Archimedean axiom) has to be
supplemented as shown in Fishburn [223].
1.3 Risk aversion

When facing alternatives with comparable returns, what are the attitudes
of investors towards risk? Despite, for example, state-owned lotteries, usual
observations on financial or insurance markets show that human beings are
generally risk-averse. For example, they choose to invest in risky assets
only if their expected returns are significantly larger than the riskless one.
To illustrate this notion, consider the construction of Friedman and Savage
[242]: let $X$ be a random variable with only two values $x_1$ and $x_2$, let $p$ be the
probability of $x_1$, and let $(1-p)$ be the probability of $x_2$. Let $u$ represent a utility
function defined on the outcomes. Consider the following two lotteries $L^a$ and
$L^b$: lottery $L^a$ pays the amount $E[X]$ with probability equal to 1. Lottery $L^b$
pays $x_1$ with probability $p$, and $x_2$ with probability $(1-p)$. These lotteries
have the same expected income, but an investor who is averse to risk would
select $L^a$ instead of $L^b$. Moreover, we have:
$$U[L^a] = u(E[X]) \quad \text{and} \quad U[L^b] = E[u(X)]. \tag{1.3}$$

Then, as illustrated in Figure 1.1, the concavity of $u$ implies that the utility
of expected return $u(E[X])$ is greater than the expected utility $E[u(X)]$.

[Figure 1.1: horizontal axis marked at $x_1$, $C[X]$, $E[X]$, $x_2$]
FIGURE 1.1: Risk aversion, certainty equivalence, and concavity

Indeed, by the definition of concavity, we have:
$$\forall \lambda \in [0,1], \forall a \in \mathbb{R}, \forall b \in \mathbb{R}, \quad \lambda u(a) + (1-\lambda) u(b) \le u(\lambda a + (1-\lambda) b).$$
The basic mathematical result is the well-known Jensen’s inequality: consider
a concave real-valued function $f$ and a real-valued random variable $Y$ with
finite expectation. Then we have $E[f(Y)] \le f(E[Y])$.

Another way to represent this risk-aversion is to introduce the certainty
equivalent of the lottery $L^b$. This lottery, denoted by $C[X]$, is the sure or
certainty-equivalent lottery which yields the same utility as the random
lottery $L^b$. Thus, the investor is indifferent between receiving $C[X]$
with certainty and investing in the risky lottery $L^b$ with expected return $E[X]$.
Risk-aversion is equivalent to the inequality $C[X] \le E[X]$. The difference
$\pi[X] = E[X] - C[X]$ is called the risk-premium, as introduced in Pratt
[412]. It is the maximum amount that the investor accepts to lose in order
to get a riskless income. From these properties, we can propose the following
definitions:

DEFINITION 1.2

1) An investor is risk-averse if $C[X] \le E[X]$, or equivalently if $\pi[X] \ge 0$, for
all random variables $X$.
2) An investor is risk-neutral if $C[X] = E[X]$, or equivalently if $\pi[X] = 0$, for
all random variables $X$.
3) An investor is risk-loving if $C[X] \ge E[X]$, or equivalently if $\pi[X] \le 0$, for
all random variables $X$.
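
The three attitudes can be illustrated on the two-point lottery used above. This is a minimal sketch under stated assumptions: the payoffs (50 or 150 with equal probability) and the three utility functions (logarithmic, linear, quadratic) are hypothetical choices made for illustration, and `certainty_equivalent` is a helper name introduced here.

```python
import math

def certainty_equivalent(u, u_inv, values, probs):
    """C[X] solves u(C[X]) = E[u(X)], so C[X] = u_inv(E[u(X)])."""
    expected_u = sum(p * u(x) for x, p in zip(values, probs))
    return u_inv(expected_u)

# Hypothetical two-point lottery with E[X] = 100.
values, probs = [50.0, 150.0], [0.5, 0.5]
mean = sum(p * x for x, p in zip(values, probs))

# Concave, linear and convex utilities (illustrative choices).
ce_log = certainty_equivalent(math.log, math.exp, values, probs)
ce_lin = certainty_equivalent(lambda x: x, lambda y: y, values, probs)
ce_sq = certainty_equivalent(lambda x: x * x, math.sqrt, values, probs)

for name, ce in [("log", ce_log), ("linear", ce_lin), ("square", ce_sq)]:
    # Risk premium pi[X] = E[X] - C[X]
    print(name, round(ce, 4), round(mean - ce, 4))
```

With these numbers the logarithmic investor has a positive risk premium, the linear one a zero premium, and the quadratic (convex) one a negative premium, in line with the three cases of the definition.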
































Using the properties of concavity/convexity of utility functions, we deduce 
a characterization of the risk-aversion: 


THEOREM 1.2 
Let u be a utility function representing preferences over the set of outcomes. 
Assume that u is increasing. Then: 
1) The function u is concave if and only if the investor is risk-averse. 
2) The function u is linear if and only if the investor is risk-neutral. 
3) The function u is convex if and only if the investor is risk-loving. 





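PROOF Consider for example the concavity case. Then:
$$\forall \lambda \in [0,1], \forall a \in \mathbb{R}, \forall b \in \mathbb{R}, \quad \lambda u(a) + (1-\lambda) u(b) \le u(\lambda a + (1-\lambda) b).$$
For a random variable $X$ taking the value $a$ with probability $\lambda$ and the value $b$
with probability $1-\lambda$, we have $E[X] = \lambda a + (1-\lambda) b$
and $E[u(X)] = \lambda u(a) + (1-\lambda) u(b)$. By definition, $u(C[X]) = E[u(X)]$. Thus,
$u(C[X]) \le u(E[X])$. Therefore, since $u$ is increasing, we deduce: $C[X] \le E[X]$,
which is the definition of risk-aversion. □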
REMARK 1.2 As noted in [242], an investor’s utility function may have
different curvatures: for example, he may be risk-averse for small and very
high wealth levels, but risk-loving for intermediate levels. In that case, his
utility function has a double inflection. However, such behavior modelling can
be criticized, as was done by Markowitz [374].


1.3.1 Arrow-Pratt measures of risk aversion 


Human beings’ preferences are heterogeneous: some may prefer safety to
risk, others do not. In the first case, they will invest significantly in “riskless”
assets, treasury bonds for example. In the second case, they will purchase
stocks that will represent a high proportion of their financial portfolios. But
how can we measure the “degree” of risk aversion of an investor? Since
utility functions are defined only up to linear transformations, the concavity
itself is not sufficient to characterize this degree. Another possible approach
is to examine the risk premia and to relate them to concavity. This is the
way chosen by Pratt [412] and Arrow [29]. Consider the following result due
to Pratt:


DEFINITION 1.3 Let $u$ and $v$ be two utility functions representing
preferences over wealth. The preference $u$ has more risk-aversion than $v$ if the
risk-premia satisfy: $\pi_u(X) \ge \pi_v(X)$, for all random real-valued variables $X$.


THEOREM 1.3
Let $u$ and $v$ be two utility functions representing preferences over wealth.
Assume that they are continuous, monotonically increasing, and twice
differentiable. Then the following properties are equivalent and characterize the
“more risk-aversion”:

1) The derivatives of both utility functions are such that: $-\frac{u''(x)}{u'(x)} \ge -\frac{v''(x)}{v'(x)}$,
for every $x$ in $\mathbb{R}$.

2) There exists a concave function $\Phi$ such that: $u(x) = \Phi[v(x)]$, for every
$x$ in $\mathbb{R}$.

3) The risk-premia satisfy: $\pi_u(X) \ge \pi_v(X)$, for all random real-valued
variables $X$.


PROOF Case 1: (1)⇒(2).
Since the function $v$ is monotonic and continuous, $v$ has an inverse $v^{-1}$. Define
the function $\Phi$ by: $\Phi(y) = u \circ v^{-1}(y)$. Then, by construction, we have:
$$u(x) = \Phi[v(x)].$$
Since the functions $u$ and $v$ are twice differentiable, we deduce that $\Phi$ is also
twice differentiable and:
$$u'(x) = \Phi'[v(x)]\, v'(x).$$
Thus, $\Phi'$ is non-negative. Differentiating again, we get:
$$u''(x) = \Phi''[v(x)]\, v'^2(x) + \Phi'[v(x)]\, v''(x).$$
Therefore:
$$u''(x) = \Phi''[v(x)]\, v'^2(x) + u'(x)\, v''(x)/v'(x),$$
and finally we get:
$$v''(x)/v'(x) - u''(x)/u'(x) = \left(-\Phi''[v(x)]\right)\left(v'^2(x)/u'(x)\right),$$
with $v'^2(x)/u'(x) > 0$. Now, from assumption (1), we have:
$$v''(x)/v'(x) - u''(x)/u'(x) \ge 0.$$
Consequently, $\Phi''$ is negative and $\Phi$ is concave.


Case 2: (2)⇒(3).
By definition of $C_u[X]$, we have: $u(C_u[X]) = E[u(X)]$. Therefore, since $u(x) =
\Phi[v(x)]$, then $u(C_u[X]) = E[\Phi[v(X)]]$. Now, since $\Phi$ is concave, Jensen’s
inequality gives:
$$E[\Phi[v(X)]] \le \Phi[E[v(X)]],$$
which, since $E[v(X)] = v(C_v[X])$, implies:
$$u(C_u[X]) \le \Phi[v(C_v[X])].$$
Finally, since $u(\cdot) = \Phi[v(\cdot)]$ is increasing, we deduce:
$$C_u[X] \le C_v[X],$$
which implies: $\pi_u(X) \ge \pi_v(X)$.


Case 3: (3)⇒(1). Equivalently, we can prove that not(1)⇒not(3). If “not(1)”,
then we have:
$$-\frac{u''(a)}{u'(a)} < -\frac{v''(a)}{v'(a)},$$
for some $a$ in $\mathbb{R}$. By continuity, there exists a neighborhood $V(a)$ for which
this inequality still holds for all $x \in V(a)$. Consider a random variable $X$
with values in $V(a)$ and 0 otherwise. First, we can prove that the previous
transformation $\Phi$ is convex on the set $V(a)$: in the proof of (1)⇒(2), we have
seen that
$$\frac{v''(x)}{v'(x)} - \frac{u''(x)}{u'(x)} = -k(x)\, \Phi''[v(x)], \quad \text{with } k(x) = v'^2(x)/u'(x) > 0.$$
But on $V(a)$, we have
$$\frac{v''(x)}{v'(x)} - \frac{u''(x)}{u'(x)} < 0,$$
which implies that $\Phi'' > 0$ on $V(a)$. Second, since $\Phi$ is convex, using the
result (2)⇒(3), we deduce: $\pi_u(X) < \pi_v(X)$. Thus, not(1)⇒not(3). □
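
The equivalences of Theorem 1.3 can be checked numerically on a simple pair of utilities. The sketch below is an illustration only: it assumes two exponential utilities $u(x) = -e^{-2x}/2$ and $v(x) = -e^{-x}$ (so that $-u''/u' = 2 \ge 1 = -v''/v'$) and a hypothetical two-point payoff; the helper names `cara` and `risk_premium` are introduced here.

```python
import math

def cara(A):
    """Exponential utility -exp(-A x)/A and its inverse (assumed example)."""
    u = lambda x: -math.exp(-A * x) / A
    u_inv = lambda y: -math.log(-A * y) / A
    return u, u_inv

def risk_premium(u, u_inv, values, probs):
    """pi[X] = E[X] - C[X], with C[X] = u_inv(E[u(X)])."""
    mean = sum(p * x for x, p in zip(values, probs))
    expected_u = sum(p * u(x) for x, p in zip(values, probs))
    return mean - u_inv(expected_u)

u, u_inv = cara(2.0)   # property (1): 2 >= 1 everywhere
v, v_inv = cara(1.0)

# Property (3): compare risk premia on a hypothetical payoff.
values, probs = [0.5, 1.5], [0.5, 0.5]
pi_u = risk_premium(u, u_inv, values, probs)
pi_v = risk_premium(v, v_inv, values, probs)
print(pi_u, pi_v)

# Property (2): Phi = u o v^{-1} should be concave on the range of v.
phi = lambda y: u(v_inv(y))
h = 1e-3
grid = [-0.9 + 0.1 * k for k in range(9)]   # points in (-1, 0)
second_diffs = [phi(y + h) - 2 * phi(y) + phi(y - h) for y in grid]
print(all(d < 0 for d in second_diffs))
```

For this pair, $\Phi(y) = -y^2/2$ in closed form, which is indeed concave; the second differences only confirm it numerically.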
REMARK 1.3 Using the Taylor approximation at $E[X]$, we have:
$$E[u(X)] \simeq u(E[X]) + (1/2)\, u''(E[X])\, E[(X - E[X])^2], \tag{1.4}$$
and also
$$u(E[X] - \pi[X]) \simeq u(E[X]) - u'(E[X])\, \pi[X]. \tag{1.5}$$
But, by definition, $u(E[X] - \pi[X]) = E[u(X)]$. Denote $\sigma_X^2 = E[(X - E[X])^2]$.
Then:
$$\pi[X] \simeq -(1/2)\left[u''(E[X])/u'(E[X])\right]\sigma_X^2. \tag{1.6}$$
Since $\sigma_X^2 > 0$, the ratio $-u''(E[X])/u'(E[X])$ can also be considered as a
measure of risk-aversion. Note that it is positive, since the utility function $u$
is increasing and concave. Note also that when $X = V_0 + Y$ with $E[Y] = 0$,
then $E[X]$ is equal to the initial amount $V_0$ invested on the market.
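
To see how accurate the second-order approximation of the risk premium can be, the following sketch compares it with the exact value. The setting is an assumption made for illustration: logarithmic utility (for which $-u''(x)/u'(x) = 1/x$), initial wealth $V_0 = 100$, and a zero-mean shock $Y = \pm 10$ with equal probabilities.

```python
import math

# Hypothetical payoff X = V0 + Y with E[Y] = 0.
V0, eps = 100.0, 10.0
values, probs = [V0 - eps, V0 + eps], [0.5, 0.5]

mean = sum(p * x for x, p in zip(values, probs))               # E[X] = V0
var = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))  # sigma_X^2

# Exact risk premium for u = log: C[X] = exp(E[log X]).
expected_u = sum(p * math.log(x) for x, p in zip(values, probs))
pi_exact = mean - math.exp(expected_u)

# Second-order approximation: pi ~ (1/2) * (-u''/u')(E[X]) * sigma_X^2,
# with -u''(x)/u'(x) = 1/x for the logarithm.
pi_approx = 0.5 * (1.0 / mean) * var

print(pi_exact, pi_approx)
```

The two values differ only in the third decimal place here; the gap would widen for larger shocks relative to wealth, since the approximation is local.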





























DEFINITION 1.4 The term $A(x) = -u''(x)/u'(x)$ is called the Arrow-Pratt
Measure of Absolute Risk-Aversion (ARA). Another measure allows us
to take account of the level of wealth: the ratio $R(x) = -x\, u''(x)/u'(x)$, which
is called the Arrow-Pratt Measure of Relative Risk-Aversion (RRA).
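
Both measures can be evaluated numerically for any smooth utility. The sketch below uses central finite differences, which is a choice made here for illustration and not a method from the text; the helper names `ara` and `rra`, the step size, and the two test utilities (logarithmic and exponential) are assumptions.

```python
import math

def ara(u, x, h=1e-3):
    """A(x) = -u''(x)/u'(x), estimated by central finite differences."""
    d1 = (u(x + h) - u(x - h)) / (2 * h)
    d2 = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return -d2 / d1

def rra(u, x, h=1e-3):
    """R(x) = x * A(x)."""
    return x * ara(u, x, h)

log_u = math.log                          # expected: A(x) = 1/x, R(x) = 1
exp_u = lambda x: -math.exp(-0.5 * x)     # expected: A(x) = 0.5, R(x) = 0.5 x

for x in (1.0, 2.0, 4.0):
    print(x, ara(log_u, x), rra(log_u, x), ara(exp_u, x), rra(exp_u, x))
```

The logarithm shows a decreasing ARA with a constant RRA, while the exponential shows a constant ARA with an RRA that grows with wealth.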


REMARK 1.4 From the geometrical point of view, note that the
curvature of the utility function $u$ is equal to
$$\rho(u)(x) = u''(x)/[1 + u'^2(x)]^{3/2}. \tag{1.7}$$
Taking the absolute value of $\rho$, we get the risk-aversion $-u''(x)/[1 + u'^2(x)]^{3/2}$,
which is very close to the ARA measure.


1.3.2 Standard utility functions 


From the previous risk-aversion measures, we can characterize some stan- 
dard utility functions, and in particular the general class of HARA utilities 
which are useful to get analytical results. 


DEFINITION 1.5 A utility function $u$ is said to have hyperbolic absolute
risk aversion (HARA) if the inverse of its absolute risk aversion is linear in
wealth.


PROPOSITION 1.1
HARA utility functions $u$ take the following form:
$$u(x) = a\left(b + \frac{x}{c}\right)^{1-c}, \tag{1.8}$$
with $u$ defined on the domain $b + \frac{x}{c} > 0$. The constant parameters $a$, $b$, and
$c$ satisfy the condition: $a(1-c)/c > 0$.
The ARA is given by:
$$A(x) = \left(b + \frac{x}{c}\right)^{-1}, \tag{1.9}$$
which clearly has an inverse linear in wealth $x$. To ensure that $u' > 0$ and
$u'' < 0$, it is assumed that $a(1-c)/c > 0$.


Usually, three subclasses are distinguished:

• Constant absolute risk aversion (CARA). If the parameter $c$ goes to infinity, then we obtain $A(x) = A$ constant (with $A = 1/b$). In that case, the utility function $u$ takes the form $u(x) = -e^{-Ax}/A$. Note that the RRA, $R(x) = Ax$, is increasing in wealth.

• Constant relative risk aversion (CRRA). If $b = 0$, then $R(x) = c$ constant and, up to a linear transformation, we get:

$$u(x) = \begin{cases} \dfrac{x^{1-c}}{1-c} & \text{if } c \neq 1, \\[1ex] \ln x & \text{if } c = 1. \end{cases} \tag{1.10}$$

Note that if $c < 1$, then utility goes from $0$ to $\infty$, and if $c > 1$, then utility goes from $-\infty$ to $0$. However, in all cases, this kind of utility function exhibits a decreasing absolute risk aversion (DARA).

• Quadratic utility function. Consider the case $c = -1$. Then the utility function $u$ is quadratic. Note first that we have to restrict its domain, since $u$ is decreasing on $]b, \infty[$. Second, the ARA is increasing (IARA) with wealth, which unfortunately implies that the risk premium $\pi(\cdot)$ is increasing. Thus, as wealth increases, the unwillingness to take risks increases.
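
The defining property of the HARA class, an inverse absolute risk aversion that is linear in wealth, can be illustrated numerically. The following is a minimal sketch that assumes the parametrization $u(x) = a(b + x/c)^{1-c}$ with illustrative parameter values chosen here so that $a(1-c)/c > 0$; derivatives are taken by finite differences, and the helper names are hypothetical.

```python
def hara_utility(x, a, b, c):
    """HARA utility u(x) = a * (b + x/c)**(1 - c), defined for b + x/c > 0."""
    return a * (b + x / c) ** (1.0 - c)

def numerical_ara(u, x, h=1e-4):
    """Arrow-Pratt ARA, -u''(x)/u'(x), from central finite differences."""
    first = (u(x + h) - u(x - h)) / (2.0 * h)
    second = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return -second / first

# Assumed parameters: a(1 - c)/c = (-1)(-1)/2 = 0.5 > 0.
a, b, c = -1.0, 1.0, 2.0
u = lambda x: hara_utility(x, a, b, c)

wealth_levels = (1.0, 2.0, 4.0)
risk_tolerance = [1.0 / numerical_ara(u, x) for x in wealth_levels]
closed_form = [b + x / c for x in wealth_levels]  # inverse ARA, linear in x
print(risk_tolerance, closed_form)
```

The numerical inverse ARA should track the straight line $b + x/c$ up to finite-difference error.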


REMARK 1.5 The decreasing absolute risk aversion (DARA) can be characterized by the following property: denote $V_0$ as the initial amount invested on the market and consider the random payoff $X$ defined by $X = V_0 + Y$ with $E[Y] = 0$. Then, the risk-premium $\pi_u(V_0, X)$ satisfies: for all $a > 0$, $\pi_u(V_0, X) \ge \pi_u(V_0 + a, X)$. This property is also equivalent to the existence for all $a > 0$ of a functional denoted by $\Psi_a(\cdot)$, such that: $u(x) = \Psi_a(u(x + a))$ with $\Psi_a'(\cdot) > 0$ and $\Psi_a''(\cdot) < 0$.














REMARK 1.6 The utility functions CARA and CRRA can be characterized by invariance properties respectively with respect to additive and multiplicative transformations of lotteries (see for example [162]).
1.3.3 Applications to portfolio allocation


Consider the following standard portfolio problem: the investor has an increasing and concave utility function $u$. He can invest his initial endowment in a risk-free asset $M$ (for example, a monetary asset) and in a risky asset $S$ (for example, a stock or a financial index). Denote by $r_M$ and $r_S$ the respective returns of assets $M$ and $S$. Denote by $\lambda$ the proportion of initial wealth $V_0$ invested in the risky asset. Then, the value of the portfolio at maturity is given by:

$$(1-\lambda)V_0(1+r_M) + \lambda V_0(1+r_S) = V_0(1+r_M) + \lambda V_0(r_S - r_M).$$

Up to the riskless growth factor on the first term, which does not affect the argument below, this value is written:

$$V_0 + \lambda V_0(r_S - r_M). \tag{1.11}$$

Therefore, using the standard assumption $E[r_S] > r_M$, the expected portfolio return is increasing with respect to the proportion $\lambda$. The problem of the investor is to find $\lambda$ maximizing the expected utility:

$$\max_{\lambda}\, U(\lambda) = E[u(V_0 + \lambda V_0(r_S - r_M))]. \tag{1.12}$$

Assuming that $u$ is twice differentiable, the optimum $\lambda^*$ (if it exists) is deduced from the first-order condition, which is given by the implicit relation:

$$U'(\lambda^*) = V_0\,E[(r_S - r_M)\,u'(V_0 + \lambda^* V_0(r_S - r_M))] = 0. \tag{1.13}$$

From the concavity of $u$, we get, for all $\lambda$:

$$U''(\lambda) = V_0^2\,E[(r_S - r_M)^2\,u''(V_0 + \lambda V_0(r_S - r_M))] \le 0.$$

Thus, $U'$ is decreasing. But

$$U'(0) = V_0\,u'(V_0)\,E[r_S - r_M]. \tag{1.14}$$

Thus, from the condition $E[r_S] > r_M$, we have $U'(0) > 0$ and, since $U'$ is decreasing, it can only vanish at a positive value: the portfolio proportion $\lambda^*$ invested in the risky asset is positive. Note that if $U'(0) = 0$, then $\lambda^* = 0$: it is optimal to invest the whole endowment $V_0$ in the risk-free asset. Thus, the following result is proved:


PROPOSITION 1.2
For the standard portfolio problem with a concave and differentiable utility function, the investor invests a positive proportion in the risky asset if and only if the expected excess return $E[r_S - r_M]$ is positive.














REMARK 1.7 The previous result does not depend on the level of the variance of the excess return. Even if the expected excess return is very small, it is optimal to invest part of the portfolio in the risky asset. However, the order of magnitude of $\lambda^*$ will depend on this variance. Note that if the utility function is not differentiable, then it may happen that $\lambda^* = 0$, even if $E[r_S - r_M] > 0$.
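
The first-order condition of the standard portfolio problem can be solved numerically. The sketch below is an illustration only: it assumes a two-state excess return, a CRRA marginal utility $u'(w) = w^{-2}$, $V_0 = 1$, and a search restricted to $\lambda \in [0, 5]$ (no short sale of the risky asset). All names and numbers are hypothetical choices, not taken from the text. Because $U'$ is decreasing, a simple bisection on its sign is enough.

```python
def foc(lam, v0, excess_returns, probs, marginal_utility):
    """E[r_e * u'(V0 + lam * V0 * r_e)], which has the sign of U'(lam)."""
    return sum(p * r * marginal_utility(v0 + lam * v0 * r)
               for r, p in zip(excess_returns, probs))

def optimal_proportion(v0, excess_returns, probs, marginal_utility,
                       upper=5.0, tol=1e-10):
    """Bisection on the decreasing map lam -> U'(lam) over [0, upper]."""
    if foc(0.0, v0, excess_returns, probs, marginal_utility) <= 0.0:
        return 0.0  # no positive expected excess return: hold only the risk-free asset
    if foc(upper, v0, excess_returns, probs, marginal_utility) > 0.0:
        raise ValueError("U' is still positive at the upper bound; enlarge the bracket")
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, v0, excess_returns, probs, marginal_utility) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

crra_marginal = lambda w: w ** -2.0  # assumed CRRA utility with relative risk aversion 2

# Positive expected excess return (+20% or -10% with equal odds): lambda* > 0.
lam_star = optimal_proportion(1.0, [0.20, -0.10], [0.5, 0.5], crra_marginal)
# Zero expected excess return (+10% or -10%): lambda* = 0.
lam_zero = optimal_proportion(1.0, [0.10, -0.10], [0.5, 0.5], crra_marginal)
print(lam_star, lam_zero)
```

In this two-state case the condition can also be solved by hand, giving $\lambda^* = (\sqrt{2} - 1)/(0.2 + 0.1\sqrt{2})$, which offers a check on the bisection.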
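Consider the special case of HARA utility function $u(x) = (b + x/c)^{1-c}$. Denote $r_e = (r_S - r_M)$ and $\lambda^{**}$ the optimal solution when $(b + V_0/c) = 1$. Since $u'(x)$ is proportional to $(b + x/c)^{-c}$, the first-order condition involves, for any $\lambda$:

$$E\left[r_e\left(b + \frac{V_0 + \lambda V_0 r_e}{c}\right)^{-c}\right] = \left(b + \frac{V_0}{c}\right)^{-c} E\left[r_e\left(1 + \frac{\lambda}{b + V_0/c}\,\frac{V_0 r_e}{c}\right)^{-c}\right]. \tag{1.15}$$

Therefore, since $\lambda^{**}$ satisfies $E\left[r_e\left(1 + \lambda^{**} V_0 r_e/c\right)^{-c}\right] = 0$, the optimal solution is $\lambda^* = \lambda^{**}(b + V_0/c)$, for any initial endowment $V_0$. Thus $\lambda^*$ is a linear function of the wealth $V_0$. Moreover, when $c \to \infty$, $\lambda^*$ is independent of wealth.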


The hypothesis of Arrow [29] is that individual preferences should be DARA and IRRA (Increasing relative risk aversion):

$$\frac{dA(x)}{dx} < 0 \quad \text{and} \quad \frac{dR(x)}{dx} > 0. \tag{1.16}$$


The reasoning for DARA is that, for a given risk, wealthy investors are not more risk-averse than poorer ones. IRRA implies that when both wealth and risk increase, then the readiness to bear risk should be reduced. More precisely, for the previous standard portfolio problem with two assets, if the utility function $u$ is twice-differentiable and exhibits DARA and IRRA, then the optimal amount of wealth invested in the risky asset is increasing with wealth; but it increases less than proportionally to the increase in wealth.


REMARK 1.8 When the two assets are risky, another risk aversion measure must be introduced to avoid some paradox, as shown in Ross [434]. For the Ross risk aversion measure, a utility function $u$ is said to display higher risk aversion than the utility function $v$ if there exists a constant $\lambda > 0$ such that for all $x$ and $y$, we have:

$$\frac{u''(x)}{v''(x)} \ge \lambda \ge \frac{u'(y)}{v'(y)}. \tag{1.17}$$

This means that $\inf_x u''(x)/v''(x) \ge \sup_x u'(x)/v'(x)$. This condition is equivalent to the existence of a constant $a > 0$ and a decreasing concave function $G$ such that for all $x$ in $\mathbb{R}$, $u(x) = a\,v(x) + G(x)$. Finally, it is also equivalent to the following inequality on the risk premia: $\pi_u(V_0, X) \ge \pi_v(V_0, X)$ for any initial wealth $V_0$ and for any lottery $X$ where $E[X] = V_0$.
1.4 Stochastic dominance


How can two random prospects X and Y be ranked? For a given investor, 
X is preferred to Y if the expected utility E[u(X)] is higher than E[u(Y)].
But how can they be compared if the utility is not observable? Besides, we 
have seen in previous sections how a change in the utility function modifies 
the risk premium for a given lottery with payoff X. However, how does a 
change in X modify the risk premium independently of the utility function? 
The theory of stochastic dominance is introduced to answer these questions. 
It involves using the probability distributions of random prospects X and Y. 
In particular, it provides conditions under which E[u(X)] > E[u(Y)] for all 
utility functions u in a given set. Depending on this set, several stochastic 
dominance orders are defined. 

To simplify, assume that the supports of all random variables are in an 
interval [a,b]. Denote respectively by Fx and Fy the cumulative distribution 
functions of X and Y. 















































DEFINITION 1.6 $X$ is said to dominate $Y$ according to first-order stochastic dominance (“$X \succeq_1 Y$”) if $F_X(w) \le F_Y(w)$, for all $w \in [a,b]$.


This definition is consistent with the expected utility, since we have: 


PROPOSITION 1.3
$X \succeq_1 Y$ if and only if

$$E[u(X)] = \int_a^b u(w)\,dF_X(w) \ge E[u(Y)] = \int_a^b u(w)\,dF_Y(w), \tag{1.18}$$

for any utility function $u$ which is monotonically increasing.


PROOF 1) Assume that $X \succeq_1 Y$. Consider any utility function $u$ (monotonically increasing). Integrating by parts, we get:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = \big[u(w)(F_X(w) - F_Y(w))\big]_a^b - \int_a^b u'(w)[F_X(w) - F_Y(w)]\,dw.$$

The first term is equal to 0. Moreover, by assumption, $u' \ge 0$ and $F_X(w) - F_Y(w) \le 0$. Consequently,

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] \ge 0,$$

which means that $E[u(X)] \ge E[u(Y)]$.

2) Conversely, suppose that there exists a $w_0$ such that $F_X(w_0) > F_Y(w_0)$. By right continuity, there exists a neighborhood $N(w_0)$ such that $F_X(w) > F_Y(w)$, for all $w$ in $N(w_0)$. Consider a utility function $u$ which is constant outside $N(w_0)$ and increasing inside ($u'(w) = 0$ for $w \notin N(w_0)$ and $u'(w) > 0$ for $w \in N(w_0)$). Integrating by parts, we get:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = -\int_{N(w_0)} u'(w)[F_X(w) - F_Y(w)]\,dw.$$

Since the right side is negative, we deduce $\int_a^b u(w)\,dF_X(w) < \int_a^b u(w)\,dF_Y(w)$, that is, $E[u(X)] < E[u(Y)]$ for this particular $u$. Thus, if “$X \succeq_1 Y$” is false, inequality (1.18) fails for at least one increasing utility function. Consequently, $E[u(X)] \ge E[u(Y)]$ for any utility function $u$ (monotonically increasing) implies $X \succeq_1 Y$.
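
For discrete lotteries, first-order dominance can be checked directly on the cumulative distribution functions. The sketch below is an illustration with assumed data (two three-point lotteries invented for the example) and hypothetical helper names. Since the cdfs are step functions, comparing them at every atom of either lottery is sufficient.

```python
import math

def cdf(outcomes, probs, w):
    """F(w) = P[X <= w] for a discrete lottery."""
    return sum(p for x, p in zip(outcomes, probs) if x <= w)

def first_order_dominates(x_out, x_pr, y_out, y_pr, tol=1e-12):
    """True if F_X(w) <= F_Y(w) at every atom of X or Y."""
    grid = sorted(set(x_out) | set(y_out))
    return all(cdf(x_out, x_pr, w) <= cdf(y_out, y_pr, w) + tol for w in grid)

def expected_utility(u, outcomes, probs):
    return sum(p * u(x) for x, p in zip(outcomes, probs))

# Assumed lotteries on the same outcomes: X shifts mass toward high outcomes.
outcomes = [1.0, 2.0, 3.0]
probs_x = [0.2, 0.3, 0.5]
probs_y = [0.4, 0.3, 0.3]

x_dominates_y = first_order_dominates(outcomes, probs_x, outcomes, probs_y)
y_dominates_x = first_order_dominates(outcomes, probs_y, outcomes, probs_x)
eu_gap = (expected_utility(math.sqrt, outcomes, probs_x)
          - expected_utility(math.sqrt, outcomes, probs_y))
print(x_dominates_y, y_dominates_x, eu_gap)
```

Consistently with Proposition 1.3, the expected-utility gap is non-negative for the increasing utility used here; any other increasing utility can be substituted.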


























In the next figure, the random prospects $X$ and $Y$ both stochastically dominate at the first order the random prospect $Z$. Since for all $w$, $P[Z > w]$ is smaller than $P[X > w]$ and $P[Y > w]$, the expectation and the expected utility of $Z$ are smaller than those of $X$ and $Y$.

However, $X$ and $Y$ cannot be compared by this criterion. Indeed, the binary relation $\succeq_1$ is only a partial order on the space of random prospects.





FIGURE 1.2: Stochastic dominance 


From Figure (1.2), we see that the random prospect $X$ is “less dispersed” than $Y$: the area $S(w) = \int_a^w [F_Y - F_X](s)\,ds$ between the two curves is always positive (for $w < c$, it is in $[0, S_1]$, strictly positive and, assuming $-S_2 < S_1$, it is also positive for $w > c$).


DEFINITION 1.7 $X$ is said to dominate $Y$ according to second-order stochastic dominance (“$X \succeq_2 Y$”) if $\int_a^w F_X(s)\,ds \le \int_a^w F_Y(s)\,ds$, for all $w \in [a,b]$ or, equivalently, the area $S(w) = \int_a^w [F_Y - F_X](s)\,ds$ is always positive.


REMARK 1.9 As mentioned in Hadar and Russell [277], or in Rothschild and Stiglitz [435], this property is equivalent to the fact that all min-utility investors prefer $X$ to $Y$: for all $w_0$ in $[a,b]$, we have:

$$E[\min(X, w_0)] \ge E[\min(Y, w_0)]. \tag{1.19}$$

Indeed, this latter condition is equivalent to:

$$\int_a^{w_0} w\,dF_X(w) + w_0(1 - F_X(w_0)) \ge \int_a^{w_0} w\,dF_Y(w) + w_0(1 - F_Y(w_0)),$$

which is also equivalent to (by integrating by parts)

$$w_0 - \int_a^{w_0} F_X(w)\,dw \ge w_0 - \int_a^{w_0} F_Y(w)\,dw,$$

and, finally: $\int_a^{w_0} F_X(w)\,dw \le \int_a^{w_0} F_Y(w)\,dw$, for all $w_0$ in $[a,b]$. In particular, this property implies that if $X \succeq_2 Y$, then $E[X] \ge E[Y]$. Note also that obviously first-order stochastic dominance implies second-order stochastic dominance but not vice-versa.
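
The min-utility characterization (1.19) gives a convenient test of second-order dominance for discrete lotteries. The sketch below is an illustration with assumed data (a sure payoff and a mean-preserving spread of it) and hypothetical function names. The difference $E[\min(X, w_0)] - E[\min(Y, w_0)]$ equals the area $S(w_0)$, which is piecewise linear in $w_0$ with kinks only at atoms, so checking the atoms is sufficient.

```python
def expected_min(outcomes, probs, w0):
    """E[min(X, w0)] for a discrete lottery."""
    return sum(p * min(x, w0) for x, p in zip(outcomes, probs))

def second_order_dominates(x_out, x_pr, y_out, y_pr, tol=1e-12):
    """True if E[min(X, w0)] >= E[min(Y, w0)] at every atom of X or Y."""
    grid = sorted(set(x_out) | set(y_out))
    return all(expected_min(x_out, x_pr, w) >= expected_min(y_out, y_pr, w) - tol
               for w in grid)

def cdf(outcomes, probs, w):
    return sum(p for x, p in zip(outcomes, probs) if x <= w)

# Assumed example: X pays 2 for sure, Y pays 1 or 3 with equal odds (same mean).
x_out, x_pr = [2.0], [1.0]
y_out, y_pr = [1.0, 3.0], [0.5, 0.5]

x_ssd_y = second_order_dominates(x_out, x_pr, y_out, y_pr)
y_ssd_x = second_order_dominates(y_out, y_pr, x_out, x_pr)
# First-order dominance fails here: F_X(2) = 1 exceeds F_Y(2) = 0.5.
x_fsd_y = all(cdf(x_out, x_pr, w) <= cdf(y_out, y_pr, w) for w in (1.0, 2.0, 3.0))
print(x_ssd_y, y_ssd_x, x_fsd_y)
```

The example shows a pair ranked by the second-order criterion but not by the first-order one.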




















A complete characterization of second-order stochastic dominance is provided by conditions under which $E[u(X)] \ge E[u(Y)]$ for all utility functions $u$ in a given set:


























PROPOSITION 1.4
$X \succeq_2 Y$ if and only if $E[u(X)] \ge E[u(Y)]$, for any utility function $u$ which is monotonically increasing and concave.


























PROOF 1) Assume that $X \succeq_2 Y$. Consider any utility function $u$ increasing, concave and twice-differentiable. Integrating by parts, we get:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = \big[u(w)(F_X(w) - F_Y(w))\big]_a^b - \int_a^b u'(w)[F_X(w) - F_Y(w)]\,dw.$$

Since the first term is equal to 0, then:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = -\int_a^b u'(w)[F_X(w) - F_Y(w)]\,dw.$$

Integrating by parts again, we deduce:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = \left[u'(w)\int_a^w [F_Y - F_X](s)\,ds\right]_a^b - \int_a^b u''(w)\left(\int_a^w [F_Y - F_X](s)\,ds\right)dw.$$

Recall that by assumption the area $S(w) = \int_a^w [F_Y - F_X](s)\,ds$ is positive. Then we deduce:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = u'(b)S(b) - \int_a^b u''(w)S(w)\,dw.$$

Since $S$, $u'$, and $-u''$ are positive, we get $\int_a^b u(w)[dF_X(w) - dF_Y(w)] \ge 0$, which gives $E[u(X)] \ge E[u(Y)]$.

2) Conversely, suppose that there exists $w_0$ such that $S(w_0) < 0$. By continuity, there exists a neighborhood $N(w_0)$ such that $S(w) < 0$, for all $w$ in $N(w_0)$. Consider an increasing utility function $u$ which is linear outside $N(w_0)$ and concave inside ($u''(w) = 0$ for $w \notin N(w_0)$ and $u''(w) < 0$ for $w \in N(w_0)$), for instance with $u$ constant to the right of $N(w_0)$, so that $u'(b) = 0$. Then, we get:

$$\int_a^b u(w)[dF_X(w) - dF_Y(w)] = -\int_{N(w_0)} u''(w)S(w)\,dw.$$

Since $u''S > 0$ on $N(w_0)$, the right side is negative, and we deduce $\int_a^b u(w)\,dF_X(w) < \int_a^b u(w)\,dF_Y(w)$. Thus, if “$X \succeq_2 Y$” is false, there is an increasing and concave utility function $u$ for which $E[u(X)] < E[u(Y)]$. Consequently, $E[u(X)] \ge E[u(Y)]$ for any increasing and concave utility function $u$ implies $X \succeq_2 Y$.


























REMARK 1.10 (See Hanoch and Lévy [281]). Assume, as in Figure (1.2), that the two curves of $F_X$ and $F_Y$ have only one intersection point. Suppose that there exists $c$ such that:

$$F_X(w) \le F_Y(w), \text{ for any } w < c \text{ and } F_X(w) \ge F_Y(w), \text{ for any } w > c.$$

Then:

$$X \succeq_2 Y \text{ if and only if } E[X] \ge E[Y]. \tag{1.20}$$
In fact, for two random prospects $X$ and $Y$ with the same expected returns, the notion of second-order stochastic dominance is equivalent to several definitions of “riskiness,” introduced in [277], [281], [435], and [436]. Summing up:


PROPOSITION 1.5
The three following statements are equivalent:

1) Having the same expectation, the expected utility of $X$ is greater than the expected utility of $Y$ for any utility function $u$ which is monotonically increasing and concave:

$$E[X] = E[Y] \quad \text{and} \quad E[u(X)] \ge E[u(Y)]. \tag{1.21}$$

2) Mean-preserving and increasing in spread: the area between the two cumulative distribution functions

$$S(w) = \int_a^w [F_Y - F_X](s)\,ds \tag{1.22}$$

satisfies: $S(b) = 0$ (i.e. $E[X] = E[Y]$), and $S(w)$ is always positive (i.e. $X \succeq_2 Y$).

3) Adding a noise: there exists a random variable $\epsilon$ such that $Y = X + \epsilon$ and $E[\epsilon|X] = 0$.
















































































REMARK 1.11 Statement (3) justifies the diversification. Consider $n$ lotteries with net gains $G_1, \ldots, G_n$ which are assumed to be independent with the same probability distribution. Consider any feasible strategy $\theta = (\theta_1, \ldots, \theta_n)$ with weights $\theta_i$ such that $\sum_{i=1}^n \theta_i = 1$. Consider the final wealth

$$X = \frac{1}{n}\sum_{i=1}^n G_i$$

associated to the perfect diversification strategy, and the final wealth

$$Y = \sum_{i=1}^n \theta_i G_i$$

associated to any strategy $\theta$. Then the perfect diversification strategy second-order dominates any alternative one: $X \succeq_2 Y$. Indeed, we have:

$$Y = X + \sum_{i=1}^n \left(\theta_i - \frac{1}{n}\right) G_i.$$

Since, by symmetry, $E[G_i|X]$ is independent of $i$, we deduce:

$$E\left[\sum_{i=1}^n \left(\theta_i - \frac{1}{n}\right) G_i \,\Big|\, X\right] = E[G_1|X] \sum_{i=1}^n \left(\theta_i - \frac{1}{n}\right) = 0,$$

which implies $X \succeq_2 Y$ by the previous statement (3).
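
The diversification argument can be checked by exact enumeration on a small case. The sketch below assumes three i.i.d. gains taking the values $-1$ or $2$ with equal probability and an arbitrary alternative weight vector $(0.6, 0.3, 0.1)$; these numbers and the helper names are illustrative choices, not taken from the text. Second-order dominance is tested through the min-utility criterion $E[\min(X, w_0)] \ge E[\min(Y, w_0)]$ at every atom.

```python
from itertools import product

def distribution(weights, gains, gain_probs):
    """Exact law of sum_i weights[i] * G_i for i.i.d. discrete gains G_i."""
    outcomes, probs = [], []
    for combo in product(range(len(gains)), repeat=len(weights)):
        outcomes.append(sum(w * gains[k] for w, k in zip(weights, combo)))
        prob = 1.0
        for k in combo:
            prob *= gain_probs[k]
        probs.append(prob)
    return outcomes, probs

def expected_min(outcomes, probs, w0):
    return sum(p * min(x, w0) for x, p in zip(outcomes, probs))

def second_order_dominates(x, y, tol=1e-9):
    """x and y are (outcomes, probs) pairs; test E[min] at every atom."""
    grid = sorted(set(x[0]) | set(y[0]))
    return all(expected_min(*x, w) >= expected_min(*y, w) - tol for w in grid)

n = 3
gains, gain_probs = [-1.0, 2.0], [0.5, 0.5]          # assumed i.i.d. gains
equal = distribution([1.0 / n] * n, gains, gain_probs)   # perfect diversification
tilted = distribution([0.6, 0.3, 0.1], gains, gain_probs)  # some other strategy

equal_dominates = second_order_dominates(equal, tilted)
tilted_dominates = second_order_dominates(tilted, equal)
print(equal_dominates, tilted_dominates)
```

Both strategies have the same expected final wealth, so the ranking reflects dispersion only.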
1.5 Alternative expected utility theory


The key property linked to the expected utility theory is the Independence Axiom, which may fail empirically and can yield some paradoxes, as shown by Allais [19]: consider three outcomes $g_1 = 0$, $g_2 = 100$, and $g_3 = 500$ and two pairs of probability distributions $(P_1, P_2)$ and $(Q_1, Q_2)$, defined by:

$$\begin{array}{lll}
P_1(g_1) = 0\%, & P_1(g_2) = 100\%, & P_1(g_3) = 0\%, \\
P_2(g_1) = 1\%, & P_2(g_2) = 89\%, & P_2(g_3) = 10\%,
\end{array}$$

$$\begin{array}{lll}
Q_1(g_1) = 89\%, & Q_1(g_2) = 11\%, & Q_1(g_3) = 0\%, \\
Q_2(g_1) = 90\%, & Q_2(g_2) = 0\%, & Q_2(g_3) = 10\%.
\end{array}$$

As experimental evidence often shows, people usually prefer lottery $P_1$ to lottery $P_2$, and lottery $Q_2$ to lottery $Q_1$, while the independence axiom implies that if you prefer $P_1$ to $P_2$, then you must prefer $Q_1$ to $Q_2$.
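
The reason expected utility cannot accommodate the Allais pattern is that, for any utility function $u$, the difference $E_{P_1}[u] - E_{P_2}[u]$ equals $E_{Q_1}[u] - E_{Q_2}[u]$: both reduce to $0.11\,u(100) - 0.01\,u(0) - 0.10\,u(500)$. The short sketch below illustrates this with a few utility functions chosen here as assumptions, with $Q_2$ assigning 90%, 0% and 10% to the three outcomes.

```python
import math

outcomes = [0.0, 100.0, 500.0]
P1 = [0.00, 1.00, 0.00]
P2 = [0.01, 0.89, 0.10]
Q1 = [0.89, 0.11, 0.00]
Q2 = [0.90, 0.00, 0.10]

def expected_utility(u, probs):
    return sum(p * u(g) for g, p in zip(outcomes, probs))

# Illustrative utilities (assumptions): any increasing function could be used.
utilities = {
    "linear": lambda g: g,
    "sqrt": math.sqrt,
    "log": lambda g: math.log(1.0 + g),
    "cara": lambda g: 1.0 - math.exp(-g / 100.0),
}

gaps = {
    name: (expected_utility(u, P1) - expected_utility(u, P2),
           expected_utility(u, Q1) - expected_utility(u, Q2))
    for name, u in utilities.items()
}
for name, (gap_p, gap_q) in gaps.items():
    print(name, gap_p, gap_q)  # the two gaps coincide for every utility
```

Since the two gaps always share the same sign, an expected-utility maximizer who ranks $P_1$ above $P_2$ must rank $Q_1$ above $Q_2$.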


Besides some empirical violations of the Independence Axiom, the expected utility theory raises a problem for the interpretation of the utility function $u$, as shown for example in Cohen and Tallon [125]. In fact, the utility function must simultaneously give a representation of the choice among outcomes, and also be the expression of the attitude towards risk. Therefore, for example, an investor who has a decreasing marginal utility $u'$ is necessarily risk-averse.

To answer this kind of criticism, several responses have been given:

• First, following Marschak [378] or Savage [449], expected utility theory is what “rational” people ought to do under uncertainty, and not necessarily what they actually do. This means that if they were perfectly informed and aware of their decisions, they would behave according to expected utility theory.

• Second, new theories of choice under uncertainty can be introduced to avoid, for example, the Allais paradox.


During the 1980s, the revision of the expected utility paradigm was intensively developed by slightly modifying or relaxing the original axioms. Among several proposals are: the Weighted Utility Theory of Chew and MacCrimmon [120] that assumes a weaker form of the axiom of independence; the Prospect Theory of Kahneman and Tversky [314] which modifies the axioms much more; the Non-Linear Expected Utility Theory of Machina [367]; the Anticipated Utility Theory of Quiggin [420]; the Dual Theory of Yaari [507]; and the Regret Theory introduced by Loomes and Sugden [365].
1.5.1 Weighted utility theory


One of the first theories that was consistent with the Allais paradox was the “weighted utility theory”, introduced by Chew and MacCrimmon [120] and further developed by Chew [119] and Fishburn [224]. The idea is to apply a transformation on the initial probability.

The basic result of Chew and MacCrimmon yields the following representation of preferences over lotteries $L = \{(x_1,p_1), \ldots, (x_m,p_m)\}$:

$$U(L) = \sum_i u(x_i)\,\phi(p_i) \quad \text{with} \quad \phi(p_i) = \frac{p_i}{\sum_j v(x_j)\,p_j}, \tag{1.23}$$

where $u$ and $v$ are two different elementary utility functions.


Another approach is the “prospect theory” introduced by Kahneman and Tversky [314]. Consider the following example where two lotteries are proposed.

Example 1.1

TABLE 1.1: Kahneman and Tversky example

Problem 1
Assume you are 300 richer than you are today. Choose between:
A. The certainty of earning 100
B. 50% probability of winning 200 and 50% of not winning anything

Problem 2
Assume you are 500 richer than today. Choose between:
C. A sure loss of 100
D. 50% chance of not losing anything and 50% chance of losing 200

Most people choose A and D. Thus they violate the theory of expected utility (the independence axiom of the theory). However, in terms of expected utility, the two problems are equivalent:

TABLE 1.2: Equivalence of the two problems

Problem 1
Case A: 400 with prob=1
Case B: 300 with prob=0.5 or 500 with prob=0.5

Problem 2
Case C: 400 with prob=1
Case D: 300 with prob=0.5 or 500 with prob=0.5

In fact, in Table 1.2 the available wealth is considered after the choice has been made. Thus, most people behave as risk takers when facing a problem presented in terms of loss (Problem 2), while they behave as risk-averse when the same problem is presented in terms of gain (Problem 1).


This behavioral inconsistency is called the “framing effect,” and shows that 
the mental representation of a choice problem may be crucial. 


Kahneman and Tversky observe that “the preferences observed in the two problems are of particular interest as they violate not only the theory of expected utility, but practically all choice models based on other normative theories.”


The idea of the prospect theory is to represent the preferences by means of a function $\phi$ such that the utility of a lottery

$$L = \{(x_1, p_1), \ldots, (x_n, p_n)\}$$

is given by:

$$U(L) = \sum_{i=1}^n u(x_i)\,\phi(p_i), \tag{1.24}$$

where $\phi$ is an increasing function defined on $[0,1]$ with values in $[0,1]$ and $\phi(0) = 0$, $\phi(1) = 1$.

The function $\phi(\cdot)$ is a transformation of the initial probability and corresponds to a decision weight functional. It allows us to take account of a “certainty effect.” For example, if the function $\phi$ is not left-continuous at 1, then we may have $\phi(p) < p$ in a neighborhood of 1. This is the result of the passage from certitude to uncertainty. Note that the equality $\sum_{i=1}^n \phi(p_i) = 1$ may no longer be true.


Using this transformation, the Allais paradox can be solved. Moreover, from experimental observations, Kahneman and Tversky argue that it is necessary to distinguish positive results (gains) from negative ones (losses). However, the sub-additivity of $\phi$ which is induced:

$$\forall p_1, p_2 \in\, ]0,1[, \quad \phi(p_1) + \phi(p_2) < \phi(p_1 + p_2), \tag{1.25}$$

may imply the violation of the first-order stochastic dominance, as in other models with weighted probabilities.


To solve this problem, alternative approaches can be proposed. 
1.5.2 Rank dependent expected utility theory


The “Rank Dependent Expected Utility” theory (RDEU) assumes that peo- 
ple consider cumulative distribution functions rather than probabilities them- 
selves. In this framework, it is possible to introduce preference representations 
that are compatible with the first-order stochastic dominance. 


The functional representation of preferences is defined as follows: 


DEFINITION 1.8 For all random variables $X$ and $Y$ which model results or consequences and with values in $[-M, M]$,

$$X \succeq Y \iff V(X) \ge V(Y), \quad \text{with} \quad V(Z) = \int_{-M}^{M} u(z)\,d\Phi(F_Z(z)), \tag{1.26}$$

where the function $u(\cdot)$ is continuous and differentiable, non-decreasing and unique up to a positive affine transformation, and $\Phi(\cdot)$ is a continuous function, non-decreasing from $[0,1]$ to $[0,1]$. Without loss of generality, it can be assumed that $\Phi(0) = 0$ and $\Phi(1) = 1$, in which case $\Phi(\cdot)$ is unique.


Note that for a discrete lottery $L = \{(x_1,p_1), \ldots, (x_m,p_m)\}$ with $x_1 \le x_2 \le \ldots \le x_m$, the utility $V$ is given by:

$$\begin{aligned}
V(L) &= \sum_{i=1}^m u(x_i)\left[\Phi\Big(\sum_{j=1}^{i} p_j\Big) - \Phi\Big(\sum_{j=1}^{i-1} p_j\Big)\right] \\
&= u(x_1) + \sum_{i=2}^m \big(u(x_i) - u(x_{i-1})\big)\left[1 - \Phi\Big(\sum_{j=1}^{i-1} p_j\Big)\right].
\end{aligned} \tag{1.27}$$

Since the weights $\Phi(\sum_{j=1}^{i} p_j) - \Phi(\sum_{j=1}^{i-1} p_j)$ depend on the ranking of the outcomes $x_i$, this preference representation is called “rank dependent expected utility.” These weights are calculated by first ranking outcomes from the worst to the best, then by summing up the utilities weighted by the sequence $\big(\Phi(\sum_{j=1}^{i} p_j) - \Phi(\sum_{j=1}^{i-1} p_j)\big)_i$. Thus, it is assumed that an objective probability exists, but individuals transform this given law by using a function of its cdf. Contrary to a transformation defined on the pdf itself, this allows us not to violate the first-order stochastic dominance.
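
The two expressions of the discrete rank dependent utility can be compared numerically. The sketch below is an illustration under assumed inputs: a three-outcome lottery, a square-root utility and the transformation $\Phi(p) = p^2$, none of which comes from the text; the function names are hypothetical. It also checks that the identity transformation gives back the ordinary expected utility.

```python
import math

def rdeu(outcomes, probs, u, phi):
    """Sum of u(x_i) times the increments of phi along the cdf (worst to best)."""
    total, cum = 0.0, 0.0
    for x, p in sorted(zip(outcomes, probs)):
        upper = min(1.0, cum + p)  # guard against rounding above 1
        total += u(x) * (phi(upper) - phi(cum))
        cum = upper
    return total

def rdeu_increments(outcomes, probs, u, phi):
    """u(x_1) plus utility increments weighted by 1 - phi(cdf just below x_i)."""
    pairs = sorted(zip(outcomes, probs))
    total, cum = u(pairs[0][0]), 0.0
    for (x_prev, p_prev), (x, _) in zip(pairs, pairs[1:]):
        cum = min(1.0, cum + p_prev)
        total += (u(x) - u(x_prev)) * (1.0 - phi(cum))
    return total

# Assumed example data.
outcomes, probs = [100.0, 0.0, 50.0], [0.3, 0.2, 0.5]
phi_square = lambda p: p * p
identity = lambda p: p

v_first = rdeu(outcomes, probs, math.sqrt, phi_square)
v_second = rdeu_increments(outcomes, probs, math.sqrt, phi_square)
v_identity = rdeu(outcomes, probs, math.sqrt, identity)
eu = sum(p * math.sqrt(x) for x, p in zip(outcomes, probs))
print(v_first, v_second, v_identity, eu)
```

The outcomes are sorted inside the functions, so the lottery can be supplied in any order.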


REMARK 1.12 The RDEU is a generalization of the expected utility criterion (EU). Indeed, for $\Phi(p) = p$, $\forall p \in [0,1]$, the functional representation of preferences is given by:

$$V(L) = \sum_{i=1}^m u(x_i)\left[\sum_{j=1}^{i} p_j - \sum_{j=1}^{i-1} p_j\right] = \sum_{i=1}^m p_i\,u(x_i) = EU(L).$$

If $\Phi$ is not the identity but $u(x) = x$, then the RDEU is the dual theory of Yaari [507].


As mentioned in Tallon [488], the RDEU has several advantages: 


• Contrary to the EU, the RDEU allows separation of the behavior towards wealth from the behavior towards risk. Therefore, the RDEU is compatible with usual empirical observations which show that individuals under- or overestimate probabilities of random events (i.e., are either pessimistic or optimistic).

• Contrary to the EU, the RDEU allows identification of two notions of risk-aversion: the standard weak risk-aversion and the strong risk-aversion.


Indeed, in the RDEU framework, these two notions have to be differentiated:

1) The weak risk-aversion of Arrow-Pratt: An individual prefers the expectation of the lottery to the lottery itself:

$$\forall L = \{(x_i, p_i)\}_{i=1,\ldots,m}, \quad E(L) = \sum_{i=1}^m p_i x_i \succeq L. \tag{1.28}$$

In the EU context, it is equivalent to the concavity of the utility function.

2) The strong risk-aversion of Rothschild and Stiglitz ([435] and [436]): This definition is based on the notion of mean preserving spread. Consider two random variables $X$ and $Y$ associated respectively to lotteries $L_X$ and $L_Y$.


DEFINITION 1.9 $Y$ is said to be a mean preserving spread of $X$ if:

$$E(L_X) = E(L_Y) \quad \text{and} \quad \forall T \in [-M, M],\ \int_{-M}^{T} \mathrm{Prob}\{X \le t\}\,dt \le \int_{-M}^{T} \mathrm{Prob}\{Y \le t\}\,dt. \tag{1.29}$$


























The latter condition corresponds to the second-order stochastic dominance 
as seen in Proposition 1.5. 


DEFINITION 1.10 A strongly risk-averse individual prefers $X$ to any mean preserving spread $Y$ of $X$: $L_X \succeq L_Y$.


REMARK 1.13 These two risk-aversion notions are identical in the EU framework. They correspond to the concavity of the utility function. In the RDEU context, they are distinct.

• Chateauneuf et al. [115] show that an individual satisfying RDEU with a concave utility function $u(\cdot)$ is weakly averse to risk if and only if his transformation function $\Phi$ is such that $\Phi(p) \le p$, $\forall p \in [0,1]$. Furthermore, if $u(x) = 1 - (1 - x)^n$ with $n > 1$, he is weakly averse to risk if and only if his transformation function $\Phi$ is such that $\Phi(p) \ge 1 - (1 - p)^n$, $\forall p \in [0,1]$.

• Chew, Karni and Safra [121] prove that an individual is strongly averse to risk if and only if his utility function $u$ is concave and his transformation function $\Phi$ is convex.

• Tallon [488] proves that strong risk-aversion allows us to give an interpretation of the RDEU: The beliefs of an individual are characterized by a given set of probability distributions and his utility is the infimum of the expectations of his utility on this set. This kind of result is also deduced for characterization of risk measures, as shown in Section 2.1.3 of Chapter 2.


Several models have been proposed in the RDEU framework, as recalled in what follows.


1.5.2.1 Anticipated utility theory


Quiggin [421] keeps three main properties of the EU theory: the transitivity, the first-order stochastic dominance, and the continuity. He has added the following axiom:


DEFINITION 1.11 (Weak independence Axiom)
Consider two lotteries

$$L_X = \{(x_1, p_1), \ldots, (x_n, p_n)\} \quad \text{and} \quad L_Y = \{(y_1, p_1), \ldots, (y_n, p_n)\}$$

such that

$$x_1 \le \ldots \le x_n \quad \text{and} \quad y_1 \le \ldots \le y_n$$

and

$$\forall i \in \{1, \ldots, n\},\ P[X = x_i] = P[Y = y_i].$$

Assume that there exists a common value $x_{i_0} = y_{i_0}$. Consider two lotteries $L_{X'}$ and $L_{Y'}$ which are equal respectively to $L_X$ and $L_Y$, except that $x_{i_0}$ and $y_{i_0}$ are replaced by another common value.

The preference $\succeq$ is weak independent if and only if:

$$L_X \succeq L_Y \implies L_{X'} \succeq L_{Y'}. \tag{1.30}$$
PROPOSITION 1.6
Consider a lottery $L = \{(x_1,p_1), \ldots, (x_n,p_n)\}$. Then a functional $V$ which satisfies Quiggin's conditions is given by:

$$V(L) = \sum_{i=1}^n u(x_i)\left[\Phi\Big(\sum_{j=1}^{i} p_j\Big) - \Phi\Big(\sum_{j=1}^{i-1} p_j\Big)\right], \tag{1.31}$$

where $\Phi$ is non-decreasing from $[0,1]$ to $[0,1]$, and is concave on $[0, \frac{1}{2}]$ ($\Phi(p_i) \ge p_i$) and convex on $[\frac{1}{2}, 1]$ ($\Phi(p_i) \le p_i$), with $\Phi(\frac{1}{2}) = \frac{1}{2}$ and $\Phi(1) = 1$.


As proposed in Quiggin [420], the function ® can be chosen as follows: 


spaa 9 (1.32) 
Gea 


with, for example, $\gamma = 0.6$.


REMARK 1.14 Under the previous assumptions, the first-order stochastic dominance is satisfied. Moreover, the Allais paradox is solved. Finally, the model is in accordance with the empirical result of Kahneman and Tversky [314]: individuals weight more events with small probabilities and weight less those with high probabilities.


REMARK 1.15 For the special case : u(x) = x, the RDEU approach is 
equivalent to the dual theory of Yaari [507]. The implication of such preference 
representation is that no investor will diversify: in the presence of one riskless 
asset and one risky asset, either he buys only the riskless one, or only the 
risky one. However, as shown by Eeckhoudt [185], when both assets are risky, 
and in the presence of a “background risk” (such as illness, accident, fire...), 
diversification can be observed. 


Finally, different empirical experiments have shown that individuals do not
have the same attitude towards losses and gains: the utility on losses seems 
to be convex, whereas the utility on gains seems to be concave. The value 
of each component is computed by taking the expected utility with respect 
to distortions of the distribution function which may differ for the positive 
and the negative parts of the distribution. This model can be viewed as a 
generalization of the standard rank-dependent utility model, where the same 
distortion function is used for the whole distribution. 


This kind of behavior is modelled by the “Cumulative Prospect Theory.” 
1.5.2.2 Cumulative prospect theory


Tversky and Kahneman [496] have introduced on one hand specific utility functions for losses and gains, on the other hand a transformation function of the cumulative distributions. There exist two functions, $w^-$ and $w^+$ defined on $[0,1]$, and a utility type function $v$ such that the utility $V$ on the lottery $L = \{(x_1,p_1), \ldots, (x_n,p_n)\}$ with $x_1 \le \ldots \le x_m < 0 \le x_{m+1} \le \ldots \le x_n$ is defined as follows: define $\Phi^-$ and $\Phi^+$ by: $\Phi_1^- = w^-(p_1)$ and $\Phi_n^+ = w^+(p_n)$,

$$\begin{aligned}
\Phi_i^- &= w^-\Big(\sum_{j=1}^{i} p_j\Big) - w^-\Big(\sum_{j=1}^{i-1} p_j\Big), \quad \forall i \in \{2, \ldots, m\}, \\
\Phi_i^+ &= w^+\Big(\sum_{j=i}^{n} p_j\Big) - w^+\Big(\sum_{j=i+1}^{n} p_j\Big), \quad \forall i \in \{m+1, \ldots, n-1\}.
\end{aligned} \tag{1.33}$$

Then, $V$ is given by: $V(L) = V^-(L) + V^+(L)$ with

$$V^-(L) = \sum_{i=1}^{m} v(x_i)\,\Phi_i^- \quad \text{and} \quad V^+(L) = \sum_{i=m+1}^{n} v(x_i)\,\Phi_i^+. \tag{1.34}$$

When the probability distribution $F$ has a pdf $f$ on $[-M, M]$, and the functions $w^-$ and $w^+$ have derivatives $w^{-\prime}$ and $w^{+\prime}$, then:

$$V(L) = \int_{-M}^{0} v(x)\,w^{-\prime}[F(x)]\,f(x)\,dx + \int_{0}^{M} v(x)\,w^{+\prime}[1 - F(x)]\,f(x)\,dx.$$

As in Quiggin [420], both functions $w^-$ and $w^+$ can be chosen as follows:

$$w(p) = \frac{p^{\gamma}}{\big(p^{\gamma} + (1-p)^{\gamma}\big)^{1/\gamma}}, \quad \text{with, for example, } \gamma^- = 0.69 \text{ and } \gamma^+ = 0.61.$$








































FIGURE 1.3: Kahneman and Tversky functions 

The utility function $v$ is convex on losses and concave on gains. The weighting functions $w^+$ and $w^-$ are above the identity function for small probabilities and under it for large probabilities.
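
The cumulative prospect value of a discrete lottery can be sketched as follows. The weighting function is the one given above with $\gamma^- = 0.69$ and $\gamma^+ = 0.61$. The value function $v$ is not specified in the text: the code assumes the power form $v(x) = x^{0.88}$ on gains and $v(x) = -2.25\,(-x)^{0.88}$ on losses, with parameters commonly attributed to Tversky and Kahneman's estimates, purely for illustration. Function names are hypothetical.

```python
def tk_weight(p, gamma):
    """w(p) = p^g / (p^g + (1 - p)^g)^(1/g), with p clamped to [0, 1]."""
    p = min(1.0, max(0.0, p))
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

def value(x, alpha=0.88, loss_aversion=2.25):
    """Assumed value function: concave on gains, convex and steeper on losses."""
    return x ** alpha if x >= 0 else -loss_aversion * (-x) ** alpha

def cpt_value(outcomes, probs, v, w_minus, w_plus):
    """V = V^- + V^+ with rank-dependent decision weights on each side of 0."""
    pairs = sorted(zip(outcomes, probs))
    total = 0.0
    cum = 0.0
    for x, p in pairs:              # losses: cumulate from the worst outcome upward
        if x < 0:
            total += v(x) * (w_minus(cum + p) - w_minus(cum))
            cum += p
    cum = 0.0
    for x, p in reversed(pairs):    # gains: cumulate from the best outcome downward
        if x >= 0:
            total += v(x) * (w_plus(cum + p) - w_plus(cum))
            cum += p
    return total

w_minus = lambda p: tk_weight(p, 0.69)
w_plus = lambda p: tk_weight(p, 0.61)

# A symmetric 50/50 bet on -100 or +100 gets a negative value under these assumptions.
fair_bet = cpt_value([-100.0, 100.0], [0.5, 0.5], value, w_minus, w_plus)
small, large = w_plus(0.01), w_plus(0.99)
print(fair_bet, small, large)
```

The printed weights illustrate the shape described above: the weight of a 1% event exceeds 1%, while the weight of a 99% event falls below 99%.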




1.5.3 Non-additive expected utility 


The “Choquet expected utility” of Schmeidler [454] does not assume that there exists a probability to measure the likelihood of random events. The model is based on the so-called “Choquet Integral” (see Choquet [123]). Consider, for example, a finite probability set $\Omega$ equipped with a $\sigma$-algebra $\mathcal{F}$.


DEFINITION 1.12 A Choquet measure (or “capacity”) $C$ is a function defined on $\mathcal{F}$ with values in $[0,1]$ satisfying:

$$C(\emptyset) = 0, \quad C(\Omega) = 1, \quad \forall A, B \in \mathcal{F},\ A \subset B \Rightarrow C(A) \le C(B). \tag{1.35}$$

Note that for mutually exclusive subsets $A$ and $B$, $C(A \cup B)$ may be smaller or higher than $C(A) + C(B)$.

The Choquet integral, $\int_\Omega u(x)\,dC(x)$, of a function $u$ (piecewise constant) can be defined as follows:


DEFINITION 1.13 Let $u$ be defined on the set of outcomes of a lottery $L = \{(x_1, A_1), \ldots, (x_n, A_n)\}$ with $x_1 \le \ldots \le x_n$, and where for all $i$, the event $A_i$ corresponds to the outcome $x_i$. Then

$$\int_\Omega u(x)\,dC(x) = \sum_{i=1}^n \phi_i\,u(x_i), \tag{1.36}$$

where: $\forall i = 1, \ldots, n-1$, $\phi_i = C(A_i \cup \cdots \cup A_n) - C(A_{i+1} \cup \cdots \cup A_n)$, and $\phi_n = C(A_n)$.
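
The discrete Choquet integral can be computed directly from these weights. The sketch below is an illustration on an assumed three-state space with two hypothetical capacities: an additive one, $C(A) = |A|/3$, for which the integral reduces to an ordinary expectation, and a convex one, $C(A) = (|A|/3)^2$. The states, outcomes and function names are all invented for the example.

```python
def choquet_integral(outcomes, events, u, capacity):
    """Sum of phi_i * u(x_i), with phi_i = C(A_i U ... U A_n) - C(A_{i+1} U ... U A_n).

    outcomes[i] is obtained on events[i]; the events are disjoint frozensets.
    """
    order = sorted(range(len(outcomes)), key=lambda i: outcomes[i])
    total = 0.0
    for rank, i in enumerate(order):
        upper = frozenset().union(*(events[j] for j in order[rank:]))
        upper_next = frozenset().union(*(events[j] for j in order[rank + 1:]))
        total += (capacity(upper) - capacity(upper_next)) * u(outcomes[i])
    return total

# Assumed three-state example.
events = [frozenset({"s1"}), frozenset({"s2"}), frozenset({"s3"})]
outcomes = [10.0, 20.0, 40.0]
identity = lambda x: x

additive = lambda a: len(a) / 3.0         # a probability: uniform on the 3 states
convex = lambda a: (len(a) / 3.0) ** 2    # monotone, C(empty) = 0, C(Omega) = 1

expectation = choquet_integral(outcomes, events, identity, additive)
pessimistic = choquet_integral(outcomes, events, identity, convex)
print(expectation, pessimistic)
```

With the convex capacity, the weight shifts toward the worst outcome, so the integral falls below the uniform expectation.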


The Choquet expected utility (CEU) of an individual is a functional representation of preferences based on some particular axioms.


DEFINITION 1.14 1) A preference relation $\succeq$ is said to be monotonic if, for any random variables $X$ and $Y$ defined on a probability set $\Omega$:

$$X(\omega) \ge Y(\omega)\ \ \forall \omega \in \Omega \implies X \succeq Y. \tag{1.37}$$

2) Two random variables $X$ and $Y$ defined on a probability set $\Omega$ are said to be comonotonic if there exist no distinct $\omega_1$ and $\omega_2$ in $\Omega$ such that:

$$X(\omega_1) > X(\omega_2) \quad \text{and} \quad Y(\omega_2) > Y(\omega_1). \tag{1.38}$$

3) A preference relation $\succeq$ is said to be comonotonic independent if, for all pairwise comonotonic random variables $X$, $Y$, and $Z$, and for any $\lambda$ in $]0, 1]$:

$$X \succeq Y \implies \lambda X + (1-\lambda)Z \succeq \lambda Y + (1-\lambda)Z. \tag{1.39}$$
PROPOSITION 1.7
Assume that the preference relation $\succeq$ satisfies axioms 1 and 2 (see Section (1.1.2)) and monotonicity and comonotonic independence axioms. Then there exists a unique Choquet capacity $C$ defined on the $\sigma$-algebra $\mathcal{F}$ and an affine real valued function $u$ such that:

$$L_X \succeq L_Y \iff \int_\Omega u(X(\omega))\,dC(\omega) \ge \int_\Omega u(Y(\omega))\,dC(\omega). \tag{1.40}$$


REMARK 1.16 The term $[1 - \Phi(\sum_{j=1}^{i-1} p_j)]$ in Definition (1.27) of RDEU corresponds to the term $C(A_i \cup \cdots \cup A_n)$ in Definition (1.36) of CEU. Indeed, for the RDEU, the preference functional is given by

$$V(L) = \int u_{RDEU}(x)\,d\Phi(F(x)),$$

while for the CEU, it is defined by

$$V(L) = \int u_{CEU}(x)\,dC(x).$$

This preference representation covers problems such as the Ellsberg paradox 
(see [196]). 


1.5.4 Regret theory 


Consider the following bet as examined in Lichtenstein and Slovic [356].
Assume that there are two lotteries $L_X$ and $L_Y$ such that:


$L_X$ has outcomes (A, a) with probabilities $(p, (1 - p))$,
$L_Y$ has outcomes (B, b) with probabilities $(q, (1 - q))$. (1.41)


The money amounts A and B are assumed to be large and a and b are small,
possibly negative. The main assumption is that p > q (lottery $L_X$ has a higher
probability of a large outcome), and that B > A (lottery $L_Y$ has the larger
of the two large outcomes). Thus, individuals who choose lottery $L_X$ face
a relatively higher probability of a relatively low gain. Individuals who choose
lottery $L_Y$ have a relatively smaller probability of a relatively high gain. For
example:


$L_X$ has outcomes (30, 0) with probabilities $(p = 90\%, (1 - p) = 10\%)$,
$L_Y$ has outcomes (100, 0) with probabilities $(q = 30\%, (1 - q) = 70\%)$. (1.42)
Note that the expectation of lottery $L_X$ (0.9 × 30 = 27) is lower than that of
lottery $L_Y$ (0.3 × 100 = 30).
Many experiments (see for example [356]) have shown that individuals tend
to choose lottery $L_X$ rather than lottery $L_Y$. Furthermore, they were willing
to sell their right to play lottery $L_X$ for less than their right to play lottery
$L_Y$. Thus, whereas they prefer lottery $L_X$, they were willing to accept a
lower certainty-equivalent amount of money for this lottery (25 for $L_X$) than
for the other lottery (27 for $L_Y$). This violates the transitivity
axiom: indeed, one is indifferent between the certainty-equivalent of a lottery
and the lottery itself. Thus, for this example, we have: $U(L_X) = U(25)$ and
$U(L_Y) = U(27)$. Since U is increasing, U(27) > U(25), which implies that
$U(L_Y) > U(L_X)$. However, the empirical choice is such that $U(L_X) > U(L_Y)$,
and the preference is not transitive.
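The arithmetic behind this preference reversal can be checked with a few lines. This is only a sketch: the helper `expected_value` and the variable names are illustrative, and the certainty equivalents 25 and 27 are the stated selling prices from the example above, not computed quantities.

```python
def expected_value(outcomes, probabilities):
    """Expectation of a discrete lottery."""
    return sum(x * p for x, p in zip(outcomes, probabilities))


# Lotteries of example (1.42).
ev_x = expected_value((30, 0), (0.9, 0.1))
ev_y = expected_value((100, 0), (0.3, 0.7))

# Stated certainty equivalents (selling prices) reported in the text.
ce_x, ce_y = 25, 27

# Observed choice: L_X is chosen over L_Y although it is priced lower.
chooses_x = True
preference_reversal = chooses_x and ce_x < ce_y
```

The direct choice ranks $L_X$ above $L_Y$, while the selling prices rank them the other way round. This is the inconsistency with transitivity described above.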


Transitivity is often considered as quite rational. However, we can search
for preference representation models which can “rationally” explain this
“irrationality.” Loomes and Sugden [365] proposed a “regret/rejoice” function
for pairwise lotteries which contains the outcomes of both the chosen and the
foregone lottery. Let $L_X$ and $L_Y$ be two lotteries. If $L_X$ is chosen and $L_Y$
is foregone and the outcome of $L_X$ turns out to be x and the outcome of $L_Y$
turns out to be y, then we can consider, for example, the difference between
the (elementary) utilities of the two outcomes to be a measure of regret,
i.e., $r(x, y) = u(x) - u(y)$, which is negative if “regret” and positive if
“rejoice.” Individuals thus faced with alternative lotteries do not seek to
maximize expected utility but rather to minimize expected regret. More generally,
the regret theory is defined as follows: Let R(.) be a regret/rejoice function
which is assumed to be non-decreasing and such that R(0) = 0. Introduce the
function Q given by


$$Q[z] = z + R[z] - R[-z]. \tag{1.43}$$


The function Q is non-decreasing and such that $Q[z] = -Q[-z]$. Consider two
lotteries $L_X = (x_1, \ldots, x_n)$ and $L_Y = (y_1, \ldots, y_n)$ with the same probabilities
$(p_1, \ldots, p_n)$. The expected rejoice/regret is defined by:


DEFINITION 1.15 (see [365])
1) Rejoice/regret of two lotteries:


$$E(Q(L_X, L_Y)) = \sum_{i=1}^{n} Q[u(x_i) - u(y_i)]\,p_i. \tag{1.44}$$


2) Preference on lotteries:


$$L_X \succeq L_Y \Longleftrightarrow E(Q(L_X, L_Y)) \geq 0. \tag{1.45}$$
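Relations (1.43) to (1.45) can be evaluated directly on a finite state space. The sketch below reuses the lotteries of example (1.42) and assumes, for illustration only, that the two lotteries are independent, so that there are four joint states. The utility `u(x) = x/100` and the cubic regret function are hypothetical choices, not taken from [365].

```python
def q_function(z, regret):
    """Q[z] = z + R[z] - R[-z], relation (1.43)."""
    return z + regret(z) - regret(-z)


def expected_rejoice(xs, ys, probs, utility, regret):
    """E(Q(L_X, L_Y)) of relation (1.44) over states with common probabilities."""
    return sum(q_function(utility(x) - utility(y), regret) * p
               for x, y, p in zip(xs, ys, probs))


# Joint states of the lotteries (1.42), ASSUMING independence (not stated in the text).
xs = [30, 30, 0, 0]
ys = [100, 0, 100, 0]
probs = [0.9 * 0.3, 0.9 * 0.7, 0.1 * 0.3, 0.1 * 0.7]


def utility(x):
    return x / 100.0          # hypothetical elementary utility


def no_regret(z):
    return 0.0                # R = 0 gives the linear case Q[z] = z


def cubic_regret(z):
    return z ** 3             # non-decreasing with R(0) = 0


linear_case = expected_rejoice(xs, ys, probs, utility, no_regret)
cubic_case = expected_rejoice(xs, ys, probs, utility, cubic_regret)
reverse_case = expected_rejoice(ys, xs, probs, utility, cubic_regret)

# Relation (1.45): L_X is weakly preferred to L_Y when the expectation is non-negative.
x_preferred = cubic_case >= 0
```

In the linear case the criterion equals the difference of expected utilities, E[u(X)] − E[u(Y)]. Since Q is odd, exchanging the roles of the two lotteries only changes the sign of the expectation.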





It is obvious that when Q is linear, the regret theory is equivalent to the
expected utility criterion. Note that Q is usually assumed to be convex.


1.6 Further reading


Fishburn [223] provides basic results about utility theory for decision-making. 
In [225], Fishburn recalls the history of expected utility theory based on specific
axioms and examines violations of these axioms, which are at the origin of 
new theories. Gollier [258] provides an overview about expected utility theory 
and comments on its interest with respect to alternative approaches. More- 
over, applications to static portfolio optimization are detailed. Lévy [352] 
empirically examines the absolute and relative risk aversion to check Arrow’s 
assumption: “investors reveal decreasing absolute risk aversion (DARA) and 
increasing relative risk aversion (IRRA).” The purpose of the experiment is to
test these two hypotheses when the individual’s wealth varies depending on 
his/her investment performance. The empirical result is that DARA is indeed 
strongly supported, but IRRA is rejected. 


From the 1980s, and in particular during the 1990s, new theories based on 
alternative axioms have been introduced such that the Allais hypothesis would 
emerge as a result. The debate about expected or non-expected utility has 
not yet ended. A comprehensive and detailed survey of the literature on alter- 
native expected utility theories is found in Fishburn [226]. Among them, the 
non-linear expected utility of Machina [367] appealed to the notion of “local 
expected utility.” A generalization of rank dependent expected utility is in- 
troduced in Segal ([457],[458]): there is no strict separation between behavior 
towards outcomes and their probabilities (joint function of outcomes and cdf). 


Maccheroni et al. [369] introduce dynamic variational preferences. They 
generalize the multiple priors preferences of Gilboa and Schmeidler [254] who 
model ambiguity averse agents. They provide conditions under which dynamic 
variational preferences are time consistent and have a recursive representation. 


Experiments have been conducted to check that individual choices do not 
satisfy the linearity property with respect to probabilities. Schoemaker [455] 
provides several examples showing that the expected utility property is often 
violated: the choice process is examined from economic, decision theoretic, 
and psychological perspectives. Multiple factors are examined which may 
induce variance in risk-bearing: including portfolio constraints and market 
incompleteness. In the RDEU framework, the major difficulty is to estimate 
separately the utility function and the transformation function. Wakker and 
Deneffe [501] provide a method to dissociate these two estimations. In [160] 
and [161], another approach is proposed: the idea is to test the invariance 
of choices with respect to a given transformation (for example, additive or 
multiplicative). This allows a characterization of the utility functions,
independently of the weighting of the utility. Jaffray [295] and Cohen and Tallon
[125] examine the individual’s behavior when he focuses simultaneously on
the worst and best outcomes. They propose special weighting of the utilities 
of these outcomes. 


Many recent studies are devoted to behavioral finance and its application
to portfolio management. For example, in Benzion and Yagil [53], portfolio
choices are investigated experimentally. Other consequences for financial
markets are analyzed, for instance in De Bondt and Wolff [151]. In Lévy and
Lévy [353], it is shown that, when diversification between assets is allowed,
mean-variance and prospect theory determine almost the same efficient sets,
even though prospect theory supposes that investors’ choices are based on change
of wealth rather than on total wealth, and that objective probabilities are
subjectively distorted.


Another approach, which has not yet been fully applied in financial modelling,
is based on the discrete choice theory as detailed by McFadden [366].
The idea is that individual preferences are random utility functions: one
component is observable and deterministic. The other component, which is not
observable, takes account of imperfections such as lack of information,
individuals’ heterogeneity, etc. The multinomial logit model is frequently used
to model the uncertainty about the utility function.


Chapter 2 


Risk measures 


Recent years have seen increasing development of new tools for risk management
analysis. Many sources of risk have been identified, such as market risk,
credit risk, counterparty default, liquidity risk, operational risk and others.
One of the main problems concerning the evaluation and optimization of risk 
exposure is the choice of “good” risk measures. Value-at-Risk has been in- 
troduced for bank regulation purposes. Nevertheless, due to some of its defi- 
ciencies, other risk measures have been proposed. But what axioms must be 
imposed to determine a “rational” risk measure? 


2.1 Coherent and convex risk measures 


In portfolio theory, many risk measures alternative to the variance have been 
introduced. As noted by Giacometti and Ortobelli [252], we can distinguish 
two kinds of risk measures: 


• First, the dispersion measures: these functions of risks X are increasing,
positive, and positively homogeneous. Among them: standard deviation
and mean-absolute deviation. Note that, if $X \geq 0$ and $\lambda > 1$, then


$$\rho(\lambda X) = \lambda \rho(X) > \rho(X).$$

Thus, they are not consistent with the first-order stochastic dominance.


• Second, the safety measures: these were introduced in portfolio theory
by Roy [437], Telser [490], and Kataoka [324]. The safety-first rules are
based on risk measures which involve the probability that the portfolio
return falls under a given level. These kinds of measures are consistent
with the first-order stochastic dominance.


Among the safety-first measures, a special class has been emphasized: the
coherent risk measures.


2.1.1 Coherent risk measures


In their seminal contribution, Artzner, Delbaen, Eber, and Heath [31]
introduce axioms that are based on regulation of private banks by a central
bank. Such risk measures determine the minimal amount of capital that must
be added to make the future value of a position acceptable. Consider the set
$\Omega$ of states of nature. This set may describe, for example, the possible values
of all asset prices. The probabilities of each state $\omega$ may be unknown. Let
$X(\omega)$ be the discounted future net worth of the position for each “scenario”
$\omega$. Let $\mathcal{X}$ be the set of all risks; that is, a given set of real random variables
defined on $\Omega$. The set $\mathcal{X}$ is a linear space which may be, for instance, the set
of all bounded random variables.

Assume that there exists a reference asset with a constant total return r.
Recall the axiomatics of Artzner et al. for a risk measure $\rho$ defined on the set
$\mathcal{X}$:


• Axiom T. Translation invariance: for all X in $\mathcal{X}$ and all real numbers
a, we have
$$\rho(X + a \cdot r) = \rho(X) - a.$$


This means that if the sure amount a is initially invested in the reference
asset, then the risk measure is reduced by exactly a. This
property is quite in accordance with a monetary interpretation of the
measure $\rho$. Note that in particular, we have:


$$\rho(X + \rho(X) \cdot r) = 0.$$

• Axiom S. Subadditivity: for all $X_1$ and $X_2$ in $\mathcal{X}$:
$$\rho(X_1 + X_2) \leq \rho(X_1) + \rho(X_2).$$


In particular, this property means that “a merger does not create extra
risk.” Note also that, for a global risk $X_1 + X_2$, the amount $\rho(X_1) + \rho(X_2)$
is sufficient to guarantee the position. This property does not
allow reduction of risk by dividing the total position into smaller ones,
which is highly desirable for regulation purposes, but does not satisfy
the diversification principle.


• Axiom PH. Positive homogeneity: for all X in $\mathcal{X}$ and for all $\lambda \geq 0$,
$$\rho(\lambda X) = \lambda \rho(X).$$


This property implies that the risk measure is a linear function of the
size of the position. It requires that there is no liquidity risk.


• Axiom M. Monotonicity: for all $X_1$ and $X_2$ in $\mathcal{X}$:


$$X_1 \leq X_2 \Longrightarrow \rho(X_1) \geq \rho(X_2).$$

• Axiom R. Relevance: for all X in $\mathcal{X}$ with $X \leq 0$ and $X \neq 0$,
$$\rho(X) > 0.$$


This means that if there actually exists a risk, it must be taken into
account.


DEFINITION 2.1 A risk measure satisfying the axioms of translation 
invariance, subadditivity, positive homogeneity, and monotonicity is called co- 
herent. 


2.1.2 Convex risk measures 


When the risk measure is not assumed to have variations proportional to 
the risk variations themselves, the positive homogeneity is no longer satisfied. 
Alternative axioms can be proposed. This leads to the notion of convexity, 
as introduced in Heath [286] for finite sample spaces (“weakly coherent risk 
measures”) and Föllmer and Schied [234] for general spaces (see also [236]). 


• Axiom C. Convexity: for all $X_1$ and $X_2$ in $\mathcal{X}$ and for all $0 \leq \lambda \leq 1$,
$$\rho(\lambda X_1 + (1 - \lambda) X_2) \leq \lambda \rho(X_1) + (1 - \lambda) \rho(X_2).$$


DEFINITION 2.2 A risk measure satisfying the axioms of translation 
invariance, monotonicity, and convexity is called convex. 


PROPOSITION 2.1 
A convex risk measure is coherent if it satisfies positive homogeneity. Note
also that positive homogeneity and subadditivity together imply convexity.


As mentioned in Artzner et al. [31], a class $\mathcal{A}_\rho$ of “acceptable positions”
can be associated to each risk measure $\rho$. This class, called the acceptance set
of $\rho$, contains all positions X which do not induce positive risk:


$$\mathcal{A}_\rho = \{X \in \mathcal{X} \mid \rho(X) \leq 0\}. \tag{2.1}$$


Conversely, given a class $\mathcal{A}$ in $\mathcal{X}$ of acceptable positions, a risk measure $\rho_{\mathcal{A}}$
can be defined by:


$$\rho_{\mathcal{A}}(X) = \inf\{m \in \mathbb{R} \mid m + X \in \mathcal{A}\}. \tag{2.2}$$


The following proposition (see [236] and [238]) indicates the relations between
convex risk measures and the corresponding acceptance sets.

PROPOSITION 2.2
If $\rho$ is a convex risk measure with acceptance set $\mathcal{A}_\rho$, then $\rho_{\mathcal{A}_\rho} = \rho$. Besides,
the acceptance set $\mathcal{A} = \mathcal{A}_\rho$ has the following properties:

1) The set $\mathcal{A}$ is non-empty and convex.

2) If $X \in \mathcal{A}$ and $Y \in \mathcal{X}$, then $Y \geq X$ implies $Y \in \mathcal{A}$.

3) If the risk measure $\rho$ is coherent, then $\mathcal{A}$ is a convex cone.

Conversely, if $\mathcal{A}$ is a non-empty convex subset of $\mathcal{X}$ which satisfies property
(2) and such that the corresponding functional $\rho_{\mathcal{A}}$ satisfies $\rho_{\mathcal{A}}(0) > -\infty$, then:

1) $\rho_{\mathcal{A}}$ is a convex measure of risk.

2) If $\mathcal{A}$ is a cone, then $\rho_{\mathcal{A}}$ is a coherent measure of risk.


2.1.3 Representation of risk measures 


Assume that $\mathcal{X}$ is the linear space of all bounded measurable functions
on a measurable space $(\Omega, \mathcal{F})$. Denote by $\mathcal{M}_1 = \mathcal{M}_1(\Omega, \mathcal{F})$ the class of all
probabilities on $(\Omega, \mathcal{F})$. Denote also by $\mathcal{M}_{1,f} = \mathcal{M}_{1,f}(\Omega, \mathcal{F})$ the class of all
finitely additive and non-negative set functions Q on $\mathcal{F}$ such that $Q(\Omega) = 1$. Then
a characterization of coherent risk measures can be deduced (see [30] and [31]
for finite probability spaces, and [154] for general spaces):


PROPOSITION 2.3
A functional $\rho$ is a coherent measure of risk if and only if there exists a subset
$\mathcal{Q}$ of $\mathcal{M}_{1,f}$ such that:


$$\rho(X) = \sup_{Q \in \mathcal{Q}} E_Q[-X], \quad X \in \mathcal{X}. \tag{2.3}$$


Besides, $\mathcal{Q}$ can be chosen such that it is convex and the supremum is attained.
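On a finite set of states, representation (2.3) is easy to evaluate, and the coherence axioms can be checked numerically. The sketch below is illustrative only: the three scenario measures and the two positions are made-up numbers, positions are taken as already discounted (so that the reference return is r = 1), and the helper names are not from the text.

```python
def expectation(measure, payoff):
    """E_Q[payoff] on a finite state space."""
    return sum(q * x for q, x in zip(measure, payoff))


def coherent_risk(payoff, measures):
    """rho(X) = max over Q of E_Q[-X], the finite version of relation (2.3)."""
    losses = [-x for x in payoff]
    return max(expectation(q, losses) for q in measures)


# Hypothetical set of scenario measures on three states.
scenario_measures = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5), (1 / 3, 1 / 3, 1 / 3)]

# Two hypothetical discounted positions.
x1 = (10.0, -5.0, -20.0)
x2 = (-8.0, 4.0, 12.0)

rho_1 = coherent_risk(x1, scenario_measures)
rho_2 = coherent_risk(x2, scenario_measures)
rho_sum = coherent_risk([a + b for a, b in zip(x1, x2)], scenario_measures)

# Axiom checks on this example.
subadditive = rho_sum <= rho_1 + rho_2 + 1e-12
homogeneous = abs(coherent_risk([2 * a for a in x1], scenario_measures) - 2 * rho_1) < 1e-9
translation = abs(coherent_risk([a + 3 for a in x1], scenario_measures) - (rho_1 - 3)) < 1e-9
monotone = coherent_risk([a + 1 for a in x1], scenario_measures) <= rho_1
```

Here the two positions partly offset each other, so the risk of the sum is well below the sum of the risks.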


A similar representation result is obtained in [237] for convex measures of
risk. Let $\alpha : \mathcal{M}_{1,f}(\Omega, \mathcal{F}) \to \mathbb{R} \cup \{+\infty\}$ be a functional which is bounded from
below and not identically equal to $+\infty$. Define the measure of risk $\rho$ by:


$$\rho(X) = \sup_{Q \in \mathcal{M}_{1,f}} \left(E_Q[-X] - \alpha(Q)\right). \tag{2.4}$$


The measure $\rho$ associated to the function $\alpha$ is convex.


DEFINITION 2.3 The functional $\alpha$, defined on $\mathcal{M}_{1,f}$, is called a penalty
function for the risk measure $\rho$.


PROPOSITION 2.4
Any convex measure of risk $\rho$ on $\mathcal{X}$ has the following form:


$$\rho(X) = \max_{Q \in \mathcal{M}_{1,f}} \left(E_Q[-X] - \alpha_{\min}(Q)\right), \quad X \in \mathcal{X}, \tag{2.5}$$

where the penalty function $\alpha_{\min}$ is given by:


$$\alpha_{\min}(Q) = \sup_{Y \in \mathcal{A}_\rho} E_Q[-Y], \quad \text{for } Q \in \mathcal{M}_{1,f}. \tag{2.6}$$


Additionally, the minimal penalty function $\alpha_{\min}$ represents the risk measure
$\rho$: for any penalty function $\alpha$ satisfying relation (2.5), $\alpha(Q) \geq \alpha_{\min}(Q)$, for
all $Q \in \mathcal{M}_{1,f}$.


REMARK 2.1 - The minimal penalty function $\alpha_{\min}$ of a coherent measure
of risk $\rho$ takes only the values 0 and $+\infty$ and:


$$\rho(X) = \max_{Q \in \mathcal{Q}_{\max}} E_Q[-X], \quad X \in \mathcal{X},$$


for the convex set


$$\mathcal{Q}_{\max} = \{Q \in \mathcal{M}_{1,f} \mid \alpha_{\min}(Q) = 0\},$$


which is the largest set for which the representation (2.3) holds.
- If $\mathcal{Q}$ is the set of all probability measures on $(\Omega, \mathcal{F})$, then the coherent risk
measure induced by $\mathcal{Q}$ is given by the worst case measure:


$$\rho_{\max}(X) = \sup_{Q \in \mathcal{Q}} E_Q[-X] = -\inf_{\omega \in \Omega} X(\omega), \quad \text{for all } X \in \mathcal{X}.$$


- Using a continuity argument such as, if $X_n \searrow X$ then $\rho(X_n) \nearrow \rho(X)$, the
measure of risk $\rho$ can be represented as the maximum:


$$\rho(X) = \max_{Q \in \mathcal{M}_1} \left(E_Q[-X] - \alpha_{\min}(Q)\right), \quad X \in \mathcal{X},$$


which may be too restrictive (see [237]).


2.1.4 Risk measures and utility 


As detailed in Chapter 1, under some specific assumptions, the investor’s
preference on the possible positions X can be represented by a utility function
U. Nevertheless, the expected utility criterion, which assumes in particular
that one single probability measure is determined, may not be appropriate
to model the investor’s decision under uncertainty. The investor may have
in mind a larger class of measures of occurrence of scenarios, represented by
elements of $\mathcal{M}_{1,f}$. Then, he may consider the worst case for the expected
losses of his investments. In this framework, under a set of axioms for his
preference order (see [253] and [254]), there exists a Savage representation of
U associated to a utility function u defined on the outcomes such that:


$$U(X) = \inf_{Q \in \mathcal{Q}} E_Q[u(X)], \tag{2.7}$$

where $\mathcal{Q}$ is a set of probability measures on $(\Omega, \mathcal{F})$. A coherent measure of
risk $\rho$ can be associated to this functional U from the relation:


$$U(X) = -\rho(u(X)), \quad \text{for all } X \in \mathcal{X}. \tag{2.8}$$


As detailed in [237] (see also [303]), another approach is to associate a loss
functional L to the utility U by letting L = −U. Then, there exists a loss
function l, convex and increasing, defined by $l(x) = -u(-x)$ and such that:


$$L(X) = \sup_{Q \in \mathcal{Q}} E_Q[l(-X)]. \tag{2.9}$$














A position X is acceptable if the loss functional L(X) is not higher than
a given reference level $x_0$. Then, a convex class of acceptable positions is
deduced:


$$\mathcal{A}_L = \left\{X \in \mathcal{X} \,\middle|\, L(X) = \sup_{Q \in \mathcal{Q}} E_Q[l(-X)] \leq x_0\right\}.$$

The set $\mathcal{A}_L$ defines a convex measure of risk $\rho_L$ with the representation:


$$\rho_L(X) = \sup_{Q \in \mathcal{M}_{1,f}} \left(E_Q[-X] - \alpha_L(Q)\right).$$

















Then, the penalty function $\alpha_L$ has to be determined. As proved in [238], a
general expression can be provided by using the Fenchel-Legendre transform
$l^*$ of the loss function l which is defined by:


$$l^*(y) = \sup_{x \in \mathbb{R}} [yx - l(x)].$$


PROPOSITION 2.5
The convex risk measure $\rho_L$ associated to the acceptance set $\mathcal{A}_L$ has a penalty
function $\alpha_L$ given by:


$$\alpha_L(P) = \inf_{\lambda > 0} \frac{1}{\lambda}\left(x_0 + \inf_{Q \in \mathcal{Q}} E_Q\left[l^*\left(\lambda \frac{dP}{dQ}\right)\right]\right), \tag{2.10}$$

where $\frac{dP}{dQ}$ is a generalized density in the sense of Lebesgue decomposition:
$\alpha_L(P) < \infty$ only if P is absolutely continuous with respect to at least some
$Q \in \mathcal{Q}$ (notation: $P \ll Q$).














Example 2.1
For the exponential loss function $l(x) = e^x$ and $x_0 = 1$, the penalty function
$\alpha_L$ is given by:

$$\alpha_L(P) = \inf_{Q \in \mathcal{Q}} H(P/Q), \tag{2.11}$$


where H(P/Q) denotes the relative entropy of P with respect to Q:


$$H(P/Q) = \begin{cases} E_Q\left[\dfrac{dP}{dQ} \log \dfrac{dP}{dQ}\right] & \text{if } P \ll Q, \\ +\infty & \text{otherwise.} \end{cases}$$
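To make Example 2.1 concrete, consider the special case where the set of scenarios contains a single measure Q on a finite state space (an assumption made only for this illustration). The acceptance set is then {X : E_Q[exp(−X)] ≤ 1}, and relation (2.2) gives the closed form log E_Q[exp(−X)]. The sketch below compares this closed form with a direct bisection on (2.2) and checks the lower bound implied by the penalty (2.11) for one alternative measure P. All numbers and function names are hypothetical.

```python
import math


def exponential_shortfall(position, probs, capital):
    """E_Q[l(-(X + m))] for the loss function l(x) = exp(x)."""
    return sum(p * math.exp(-(x + capital)) for x, p in zip(position, probs))


def risk_by_bisection(position, probs, x0=1.0, lo=-50.0, hi=50.0, tol=1e-10):
    """Smallest m with E_Q[l(-(X + m))] <= x0, i.e. relation (2.2) for this acceptance set.

    The shortfall is decreasing in m, so bisection applies; the bracket
    [lo, hi] is an assumption suitable for positions of moderate size.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if exponential_shortfall(position, probs, mid) <= x0:
            hi = mid
        else:
            lo = mid
    return hi


def entropic_risk(position, probs):
    """Closed form log E_Q[exp(-X)], valid for x0 = 1 and a single measure Q."""
    return math.log(sum(p * math.exp(-x) for x, p in zip(position, probs)))


def relative_entropy(p, q):
    """H(P/Q) on a finite space, assuming every q_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical position and measures on three states.
position = (1.0, -0.5, -2.0)
q_measure = (0.5, 0.3, 0.2)
p_measure = (0.2, 0.3, 0.5)

risk_numeric = risk_by_bisection(position, q_measure)
risk_closed = entropic_risk(position, q_measure)

# Dual lower bound: E_P[-X] - H(P/Q) cannot exceed the risk.
dual_bound = (sum(pi * (-x) for pi, x in zip(p_measure, position))
              - relative_entropy(p_measure, q_measure))
```

The two computations of the risk agree up to the bisection tolerance, and any single P only provides a lower bound on the supremum in the representation.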




2.1.5 Dynamic risk measures 


In Cvitanic and Karatzas [140], the idea of risk measures is introduced in a
dynamic setting. They assume that a complete financial market is given and
they define the risk of a position as the highest expected shortfall under some
set of probability measures. The corresponding risk measure is given by:


$$\rho(X; V_0) = \sup_{Q \in \mathcal{Q}} \inf_{\theta \in \Theta(V_0)} E_Q\left[\left(X - V_T^{\theta}\right)^{+}\right], \tag{2.12}$$


where $\mathcal{Q}$ is a given set of probability measures (“the possible scenarios”),
$\Theta(V_0)$ is the class of all admissible portfolios with initial endowment $V_0$, and
$V_T^{\theta}$ is the value of the portfolio $\theta$ at maturity T.


Thus, as mentioned in Frittelli [243], the dynamic property of such a mea- 
sure is due to the possible portfolio rebalancing, and not to a dynamic value 
of the measure itself. In Wang [502], the risk measure is introduced in a 
dynamic way. The class of “likelihood-based risk measures” is introduced to 
take account of dynamic readjustments (for discrete time and finite sample 
spaces). Further results can be found in [32], where two different dynamic 
measures of risk are proposed. 


In Riedel [425], the class of coherent risk measures is extended to the dy- 
namic framework by considering a predictable translation invariance to take 
new information into account, and also by introducing a dynamic consistency 
to avoid possible contradictions between judgements over time. This leads to 
the following representation: 


Consider a sequence of time periods t = 0, ..., T and a finite set $\Omega$ of states
of the world. Suppose that there exists a sequence of random variables $(Z_t)_t$
which reveals the information at time t.


Denote by $(\mathcal{F}_t)_t$ the filtration generated by the process Z:
$$\mathcal{F}_t = \sigma(Z_1, \ldots, Z_t). \tag{2.13}$$


A position $D = (D_t)_t$ is an $\mathcal{F}_t$-adapted process, which corresponds to a
sequence of random payments at any time t. It is assumed that there exists an
exogenous interest rate r. Then, under the axioms given in [425], the dynamic
risk measure assigns to the sequence D of payments the risk:


$$\rho_t(D) = \max_{Q \in \mathcal{Q}} E_Q\left[-\sum_{s=t}^{T} \frac{D_s}{(1+r)^{s-t}} \,\middle|\, \mathcal{F}_t\right]. \tag{2.14}$$

The notion of dynamic risk measures can be defined as follows (see [243]):


- First, a whole process $(\rho_t)_t$ must be introduced to represent the risk of the
position at any time, conditionally to the information available at any time t
along the period [0, T].

- Second, some boundary conditions must be satisfied. At time 0, the risk
measure $\rho_0$ is a static risk measure as defined in previous sections.
Additionally, at maturity T, $\rho_T$ is the opposite of the worth of the financial position.


Let $(\mathcal{F}_t)_t$ be a filtration defined on the probability space $(\Omega, \mathcal{F}, P)$. Denote
by $L^p = L^p(\Omega, \mathcal{F}_T, P)$ the space of all real-valued, $\mathcal{F}_T$-measurable, and
p-integrable random variables, and by $L^0_t = L^0(\Omega, \mathcal{F}_t, P)$ the space of all random
variables defined on $(\Omega, \mathcal{F}_t, P)$.


DEFINITION 2.4 A dynamic risk measure is a process $(\rho_t)_t$ such that:
i) $\rho_t : L^p \to L^0_t$, for all $t \in [0, T]$;
ii) $\rho_0$ is a static risk measure; and,
iii) $\rho_T(X) = -X$, P-a.s., for all $X \in L^p$.


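Most of the “rational” dynamic axioms that can be imposed on the process
$(\rho_t)_t$ are similar to those in the static case.


Axiom T. Translation invariance: $\forall t \in [0, T]$, $\forall a \in L^p$ and $\mathcal{F}_t$-measurable,
$\forall X \in L^p$,
$$\rho_t(X + a) = \rho_t(X) - a, \quad P\text{-a.s.}$$


Note that at time t, the random variable a can be considered as a constant
since it is observable given the information $\mathcal{F}_t$.


Axiom S. Subadditivity: $\forall t \in [0, T]$, $\forall X_1, X_2 \in L^p$,
$$\rho_t(X_1 + X_2) \leq \rho_t(X_1) + \rho_t(X_2), \quad P\text{-a.s.}$$

Axiom PH. Positive homogeneity: $\forall t \in [0, T]$, $\forall X \in L^p$ and $\forall \lambda \geq 0$,
$$\rho_t(\lambda X) = \lambda \rho_t(X), \quad P\text{-a.s.}$$

Axiom M. Monotonicity: $\forall X_1, X_2 \in L^p$,
$$X_1 \leq X_2 \Longrightarrow \forall t \in [0, T],\ \rho_t(X_1) \geq \rho_t(X_2), \quad P\text{-a.s.}$$

Axiom Ct. Constancy: $\forall c \in \mathbb{R}$, $\forall t \in [0, T]$,
$$\rho_t(c) = -c, \quad P\text{-a.s.}$$

Axiom P. Positivity: $\forall X \in L^p$,
$$X \geq 0 \Longrightarrow \forall t \in [0, T],\ \rho_t(X) \leq \rho_t(0), \quad P\text{-a.s.}$$

Axiom C. Convexity: $\forall t \in [0, T]$, $\forall X_1, X_2 \in L^p$, and $\forall \lambda$ such that $0 \leq \lambda \leq 1$,
$$\rho_t(\lambda X_1 + (1 - \lambda) X_2) \leq \lambda \rho_t(X_1) + (1 - \lambda) \rho_t(X_2).$$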


The following definitions are proposed for coherent and convex risk mea- 
sures. 


DEFINITION 2.5 (see [243])

i) A dynamic risk measure $(\rho_t)_t$ is called convex if it satisfies axiom (C)
and $\rho_t(0) = 0$.

ii) A dynamic risk measure $(\rho_t)_t$ is called coherent if it satisfies axioms (T),
(S), (PH), and (P).

iii) A dynamic risk measure $(\rho_t)_t$ is said to be time-consistent if it satisfies
the following axiom: $\forall t \in [0, T]$, $\forall X \in L^p$, $\forall A \in \mathcal{F}_t$,


$$\rho_0(X \mathbf{1}_A) = \rho_0(-\rho_t(X) \mathbf{1}_A).$$


This time-consistency condition is the condition of the “filtration-consistency”
of Coquet et al. [129] adapted to the risk measure framework. It is linked
to the recursivity property defined in Artzner et al. [32] (not their
time-consistency definition).

There exist mainly two ways to provide dynamic risk measures:

- First, by using robust representations as in the static case, as shown in
([32], [33]).

- Second, by relating dynamic risk measures to backward stochastic differential
equations (BSDE) through the notion of the “conditional g-expectation,”
as introduced in Peng [403].

The first approach can be summarized into the two following results:


Dynamic convex risk measure: Let $\mathcal{Q}$ be a convex set of P-absolutely
continuous probability measures defined on $(\Omega, \mathcal{F})$. For any $t \in [0, T]$, let
$\alpha_t : \mathcal{Q} \to \mathbb{R}$ be a convex functional such that $\inf_{Q \in \mathcal{Q}} \alpha_t(Q) = 0$. Then, the
process $(\rho_t)_t$ is defined by: $\forall t \in [0, T]$, $\forall X \in L^p$,


$$\rho_t(X) = \operatorname{ess\,sup}_{Q \in \mathcal{Q}} \left(E_Q[-X \mid \mathcal{F}_t] - \alpha_t(Q)\right). \tag{2.15}$$


This relation is the dynamic version of relation (2.5): any such process $(\rho_t)_t$
is a dynamic convex risk measure. In addition, it satisfies the axioms (T),
(Ct), and (P) (i.e., translation invariance, constancy and positivity).


Dynamic coherent risk measure: Let $\mathcal{Q}$ be a convex set of P-absolutely
continuous probability measures defined on $(\Omega, \mathcal{F}_T)$. Then, the process $(\rho_t)_t$
is defined by: $\forall t \in [0, T]$, $\forall X \in L^p$,


$$\rho_t(X) = \operatorname{ess\,sup}_{Q \in \mathcal{Q}} E_Q[-X \mid \mathcal{F}_t]. \tag{2.16}$$

This is the dynamic version of relation (2.3): any such process $(\rho_t)_t$ is a
dynamic coherent risk measure.


The second approach is based on the following BSDE:


Consider a standard d-dimensional Brownian motion $(B_t)_t$, and denote by
$(\mathcal{F}_t)_t$ the augmented filtration generated by B.
Denote by $L^2_{\mathcal{F}} = L^2_{\mathcal{F}}(T, \mathbb{R}^n)$ the space of all $\mathbb{R}^n$-valued, adapted processes $\theta$
such that:


$$E\left[\int_0^T \|\theta_t\|_n^2\,dt\right] < \infty, \tag{2.17}$$


where $\|.\|_n$ stands for the Euclidean norm on $\mathbb{R}^n$.


Let $g : [0, T] \times \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}$ be a function which satisfies usual conditions
to guarantee existence and uniqueness of the solution (Y, Z) of the following
BSDE (see [403] or [129]):


$$-dY_t = g(t, Y_t, Z_t)\,dt - Z_t\,dB_t, \quad 0 \leq t \leq T, \quad \text{and } Y_T = X.$$

The notion of g-expectation introduced in Peng [403] is as follows:

DEFINITION 2.6 For any $X \in L^2$ and for all $t \in [0, T]$, the conditional
g-expectation of X under $\mathcal{F}_t$, denoted by $\mathcal{E}_g[X \mid \mathcal{F}_t]$, is defined by:
$$\mathcal{E}_g[X \mid \mathcal{F}_t] = Y_t, \tag{2.18}$$


where Y is the first component of the solution of the previous BSDE with value
X at maturity T. For t = 0, $\mathcal{E}_g[X] = Y_0$ is called the g-expectation.


Then, a risk measure can be defined as a family of maps from $L^2$ to $L^0_t$.
Consider the process $(\rho_t)_t$ given by: for all $X \in L^2$,


$$\rho_t(X) = \mathcal{E}_g[-X \mid \mathcal{F}_t]. \tag{2.19}$$


From the representation of risk measures by conditional g-expectations,
sufficient conditions of convexity and coherence can be deduced (see [243]):


PROPOSITION 2.6

i) If the functional g is convex in $(y, z) \in \mathbb{R} \times \mathbb{R}^d$, then the process $(\rho_t)_t$
defined in (2.19) is a convex dynamic risk measure. It is also time-consistent
and satisfies axioms (T), (P), and (Ct).

ii) If the functional g is sublinear in $(y, z) \in \mathbb{R} \times \mathbb{R}^d$, then the process $(\rho_t)_t$
defined in (2.19) is a coherent dynamic risk measure. It is also time-consistent.

Conversely, sufficient conditions can be given to prove that a given risk
measure is associated to a g-expectation.


For this purpose, recall the definition of $\mathcal{E}_\mu$-dominance given in [129]: the
measure $\rho_0$ is $\mathcal{E}_\mu$-dominated if for any $X_1$ and $X_2$ in $L^2$,


$$\rho_0(X_1 + X_2) - \rho_0(X_1) \leq \mathcal{E}_{g_\mu}[-X_2], \quad \text{for some } g_\mu = \mu |z|, \text{ with } \mu > 0. \tag{2.20}$$


Assume that d = 1. Then the following result can be deduced:


PROPOSITION 2.7
Consider a dynamic convex risk measure $(\rho_t)_t$ on $L^2$ satisfying translation
invariance (T) and monotonicity (M).
1) If:
i) $(\rho_t)_t$ is time-consistent;
ii) $\rho_0$ is strictly monotone, i.e.:


$$X_1 \geq X_2 \text{ and } \rho_0(X_1) = \rho_0(X_2) \Longrightarrow X_1 = X_2; \tag{2.21}$$
iii) and, $\rho_0$ is $\mathcal{E}_\mu$-dominated,


then there exists a unique functional $g : \Omega \times [0, T] \times \mathbb{R} \to \mathbb{R}$, independent of y,
satisfying usual conditions, such that $|g(t, z)| \leq \mu |z|$ and associated to $\rho$ by:
for all X in $L^2$,


$$\rho_0(X) = \mathcal{E}_g[-X] \quad \text{and} \quad \rho_t(X) = \mathcal{E}_g[-X \mid \mathcal{F}_t]. \tag{2.22}$$
Besides, if g is P-a.s. continuous in t for all $z \in \mathbb{R}$, then g is convex in z.


2) Consider a dynamic coherent risk measure $(\rho_t)_t$ on $L^2$ satisfying properties
(i), (ii), and (iii). Then there exists a unique functional g, independent of y,
satisfying (2.22). Also, if g is P-a.s. continuous in t for all $z \in \mathbb{R}$, then g is
sublinear in z.


In [44], a general axiomatic of dynamic convex risk measures is related
to the notions of consistent convex price systems and non-linear expectations
introduced in El Karoui and Quenez [194] and Peng [403]. Then their dynamic
risk measures (so-called “market risk measures”) are also related to quadratic
BSDE.


2.2 Standard risk measures

2.2.1 Value-at-Risk


The need to introduce new specific risk measures, alternative to dispersion
measures such as standard deviation, has been driven by the regulation of
financial institutions and the development of risk management. The
Value-at-Risk (VaR) was the first attempt to provide a quantitative tool to
determine capital requirements (see for example JP Morgan [393]), and to
better take account of fat-tailed and non-normal returns; for example, when
volatilities are random and jumps may occur, when financial options
are involved in the position, when default risks are not negligible, when
cross-dependence between assets is complex, etc.


2.2.1.1 Definition of the VaR 


To illustrate the VaR concept, consider the following model: 


Example 2.2

Let $a = (a_1, \ldots, a_N)'$ be a portfolio allocation, where $a_i$ denotes the allocation
invested in the i-th financial asset. Denote by $S_{i,t}$ the price of asset i at
time t. Then, the portfolio value $V_t(a_t)$ is given by:


$$V_t(a_t) = \sum_{i=1}^{N} a_{i,t} S_{i,t} = a_t' S_t.$$

Then, its change between dates t and t + 1 is equal to:

$$\Delta V_{t+1}(a_t) = (V_{t+1} - V_t)(a_t) = \sum_{i=1}^{N} a_{i,t} (S_{i,t+1} - S_{i,t}) = a_t' \Delta S_{t+1}.$$


Assume that the future asset prices $S_{t+1} = (S_{1,t+1}, \ldots, S_{N,t+1})$ have a continuous
conditional distribution $P_{S,t}$ given the information at time t. Then, for a loss
probability level $\alpha$, with usual values between 1% and 5%, the Value-at-Risk
$VaR_{t,\alpha}(a_t)$ is defined by:


$$P_{S,t}\left[(V_{t+1} - V_t)(a_t) + VaR_{t,\alpha}(a_t) < 0\right] = \alpha. \tag{2.23}$$


Thus, the VaR corresponds to a reserve amount added to the portfolio value,
such that the probability of global loss is small (equal to the level $\alpha$). The
VaR depends on the information available at current date t, on the conditional
distributions of financial assets, on the portfolio weights, and on the loss
probability level $\alpha$.

It can also be viewed as an upper quantile at level $(1 - \alpha)$ of the potential
portfolio loss since:


$$P_{S,t}\left[(V_t - V_{t+1})(a_t) > VaR_{t,\alpha}(a_t)\right] = \alpha. \tag{2.24}$$


Assume that returns are normally distributed. Denote by $\mu_t$ and $\Sigma_t$ their
conditional mean and covariance matrix. Then, from equation (2.24), the
expression of VaR is deduced:


$$VaR_{t,\alpha}(a_t) = -a_t' \mu_t + \left(a_t' \Sigma_t a_t\right)^{1/2} q_{1-\alpha}, \tag{2.25}$$


where $q_{1-\alpha}$ is the quantile of level $(1 - \alpha)$ of the standard normal distribution.
Thus the VaR is the sum of the conditional expected negative return and the
standard deviation, weighted by a positive constant $q_{1-\alpha}$ for a given (small)
threshold $\alpha$. Therefore, as seen in Chapter 3, minimizing VaR for the Gaussian
case is equivalent to searching for a mean-variance efficient portfolio.
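Formula (2.25) can be evaluated with the Python standard library alone. The sketch below is illustrative: the function name `gaussian_var` and the numerical inputs are assumptions, with a single asset, zero mean, and a 1% standard deviation chosen so that the result can be compared with the familiar 2.33 multiple at the 1% level.

```python
import math
from statistics import NormalDist


def gaussian_var(allocation, mean, cov, alpha):
    """Gaussian Value-at-Risk of relation (2.25).

    allocation : list of holdings a_t
    mean       : list of conditional means mu_t of the value changes
    cov        : conditional covariance matrix Sigma_t (list of lists)
    alpha      : loss probability level, e.g. 0.01
    """
    n = len(allocation)
    expected_change = sum(a * m for a, m in zip(allocation, mean))
    variance = sum(allocation[i] * cov[i][j] * allocation[j]
                   for i in range(n) for j in range(n))
    quantile = NormalDist().inv_cdf(1.0 - alpha)  # q_{1-alpha}
    return -expected_change + math.sqrt(variance) * quantile


# Hypothetical one-asset position: zero mean, 1% standard deviation.
var_1pct = gaussian_var([1.0], [0.0], [[0.01 ** 2]], alpha=0.01)

# Consistency with (2.23): the probability of losing more than the VaR is alpha.
loss_probability = NormalDist(mu=0.0, sigma=0.01).cdf(-var_1pct)
```

The last line checks the defining property (2.23): under the assumed Gaussian distribution, the change in value falls below −VaR with probability α.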


For a general position X (corresponding to a P&L), upper and lower
Value-at-Risk can be defined at a given level $\alpha$ (see for example Acerbi [2]). Denote
by $F_X$ the cumulative distribution function (cdf) of the random variable X
($F_X(x) = P[X \leq x]$). Let $q_\alpha(X)$ and $q^\alpha(X)$ be respectively the lower and
upper quantiles of X.


DEFINITION 2.7 The lower $\alpha$-Value-at-Risk (usual VaR) is given by:
$$VaR_\alpha(X) = -\sup\{x \mid F_X(x) < \alpha\} = -q_\alpha(X).$$
The upper $\alpha$-Value-at-Risk is given by:
$$VaR^\alpha(X) = -\inf\{x \mid F_X(x) > \alpha\} = -q^\alpha(X). \tag{2.26}$$
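For a discrete P&L, the two quantiles of Definition 2.7 can be read off the cumulative probabilities. The following sketch is only an illustration under stated assumptions: the four-point distribution is made up, the helper name is hypothetical, and a small tolerance is used to guard against floating-point rounding in the cumulative sums.

```python
def lower_upper_quantiles(values, probs, alpha, eps=1e-12):
    """Lower and upper alpha-quantiles of a discrete distribution.

    lower: sup{x : F(x) < alpha}, which equals inf{x : F(x) >= alpha}
    upper: inf{x : F(x) > alpha}; None if F never exceeds alpha (alpha = 1).
    """
    lower = upper = None
    cumulative = 0.0
    for x, p in sorted(zip(values, probs)):
        cumulative += p
        if lower is None and cumulative >= alpha - eps:
            lower = x
        if cumulative > alpha + eps:
            upper = x
            break
    return lower, upper


# Hypothetical P&L (in %) with its probabilities.
pnl = [-10.0, -5.0, 0.0, 5.0]
probabilities = [0.05, 0.15, 0.50, 0.30]

# alpha sits exactly on a jump of the cdf: the two quantiles differ.
q_low, q_up = lower_upper_quantiles(pnl, probabilities, alpha=0.05)
var_from_lower, var_from_upper = -q_low, -q_up

# alpha strictly inside a jump: the two quantiles coincide.
q_low_mid, q_up_mid = lower_upper_quantiles(pnl, probabilities, alpha=0.10)
```

When α coincides with a value taken by the cdf, the two definitions give different answers. Otherwise they agree, as they do for a continuous and strictly increasing cdf.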


REMARK 2.2 Obviously, when the cdf of X is continuous and strictly
increasing, the two quantiles are equal. However, for discrete distribution
functions, they may be distinct. Consider for example a P&L X with possible
negative values: $-n\%$ for $-n$ in $E = \{-100, -95, \ldots, -5, 0\}$ with respective
probabilities $p_{-n}$. Then, for $\alpha_n = \sum_{k=n}^{100} p_{-k}$:


$$VaR_{\alpha_n}(X) = n\% \quad \text{and} \quad VaR^{\alpha_n}(X) = (n+1)\%.$$


REMARK 2.3 As mentioned in [5], note that these definitions are
ambiguous. For example, a 5% lower VaR is the amount that the position may
lose in the best of the 5% worst cases (and may gain in the worst of the 95%
best cases). Thus, from the loss point of view, the VaR underestimates the risk
at a given level $\alpha$, since it provides only the smallest loss at this probability
level.
FIGURE 2.1: Probability distribution (pdf) and cumulative distribution
(cdf) with $VaR_{5\%} = 13.5\%$. The 5% worst cases (resp. 95%) are shadowed
in dark (resp. light) grey. In the cdf plot, VaR is the opposite of the abscissa
of the intersection point of the cdf with the horizontal line $\alpha = 5\%$.


Figure (2.1) illustrates the determination of the VaR for a probability
distribution having a density (pdf). For a given threshold $\alpha$, the $VaR_\alpha$ is the
opposite of the quantile $q_\alpha$ of the distribution: the highest (“best”) value such
that the probability to be under this value is smaller than $\alpha$.


Figure (2.2) shows the pdf of returns and the pdf of returns conditional on 
exceeding the VaR. Two statistical models are considered for modelling the 
position: the Gaussian distribution, which has thin tails, and the Pareto-Lévy
distribution, whose “fat” tails decrease by a power. The two distributions have 
parameters such that they have the same VaR. 


The Gaussian distribution is assumed to be centered and its standard
deviation is chosen equal to 1%. In this case, the VaR corresponding to a 99%
probability of overshoot is equal to 2.33% of the value of the position.


The Lévy-stable distribution is centered on zero and is symmetrical. Its 
characteristic exponent is equal to 1.5 (which determines the decrease in the 
distribution tail). The value of the scale parameter is equal to 1. It indicates 
the dispersion of the distribution. Thus the VaR of the stable Paretian dis- 
tribution for a 99% probability of overshoot is also equal to 2.33%. Note that 
the values of the losses (lower than VaR) are near the VaR for the Gaussian 
distribution contrary to the stable Paretian case. 


[Figure 2.2: two panels over returns ranging from -10% to 10%, "Density
distributions of the returns" and "Density distributions of the returns
exceeding the VaR", with VaR = 2.33% at probability p = 99%.]


FIGURE 2.2: Level of returns for Gaussian and Stable Paretian
distributions with same VaR


2.2.1.2 Convexity of the VaR 


While the VaR satisfies translation invariance (T), positive homogeneity
(PH), and monotonicity (M), it fails to be convex for particular probability
distributions. More precisely, it may not satisfy the subadditivity
property, as shown in the following example provided by Danielsson et al.
[144]:


Example 2.3
Consider two financial assets A and B having Gaussian probability
distributions but with independent “shocks”:


$$A = \varepsilon_A + \eta_A, \quad \varepsilon_A \sim N(0,1), \quad \eta_A = \begin{cases} 0 & \text{with probability } 0.991, \\ -10 & \text{with probability } 0.009. \end{cases}$$


Then VaR(A) at the level 1% is equal to 3.1. Suppose that B has the
same probability distribution as A but is independent from A. In that case,
VaR(A + B) at the level 1% is equal to 9.8, since for the asset A + B the
probability of getting the -10 draw from A or B is higher than 1%.

Then we have:


$$VaR(A + B) = 9.8 > VaR(A) + VaR(B) = 3.1 + 3.1 = 6.2.$$
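The figures 3.1 and 9.8 can be checked numerically. The sketch below rests on an assumption about the example: each asset is read as a standard normal term plus an independent shock equal to -10 with probability 0.009 (and 0 otherwise). Under that reading the cdf is a finite mixture of normal cdfs and the 1% quantile can be located by bisection. The helper names `mixture_cdf` and `value_at_risk` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def mixture_cdf(x, shocks, scale):
    """cdf of (Gaussian with std `scale`) + (discrete shock), shocks = [(shift, prob), ...]."""
    return sum(p * PHI((x - shift) / scale) for shift, p in shocks)


def value_at_risk(shocks, scale, alpha, lo=-50.0, hi=50.0):
    """VaR = -(alpha-quantile), the quantile being found by bisection on the cdf."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, shocks, scale) < alpha:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)


p = 0.009
shocks_a = [(0.0, 1 - p), (-10.0, p)]
# Sum of two independent assets: Gaussian part N(0, 2), shocks 0, -10 or -20.
shocks_sum = [(0.0, (1 - p) ** 2), (-10.0, 2 * p * (1 - p)), (-20.0, p ** 2)]

var_a = value_at_risk(shocks_a, 1.0, 0.01)
var_sum = value_at_risk(shocks_sum, sqrt(2.0), 0.01)
```

With these assumptions `var_a` is close to 3.1 and `var_sum` is close to 9.8, so the VaR of the sum exceeds the sum of the individual VaRs.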


Nevertheless, it may be subadditive (and thus convex) for other probability
distributions (see [30] and [144]):
Example 2.4

If the joint distribution of (X,Y) is Gaussian, then the VaR is subadditive
for usual values of the level $\alpha$. Indeed, as seen previously in Example
(2.2), for a normal random variable Z, the VaR satisfies:


$$VaR_\alpha(Z) = -\left( E[Z] + q_\alpha \, \sigma(Z) \right),$$ (2.27)


where $q_\alpha$ is the quantile of the standard Gaussian distribution and is negative
since $\alpha$ is smaller than 0.5. Therefore, $VaR_\alpha(Z)$ is an increasing function of
the standard deviation $\sigma(Z)$. Since for any (X,Y), $\sigma(X+Y) \le \sigma(X) + \sigma(Y)$,
the subadditivity property is deduced.
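Written out, the argument is a one-line chain of inequalities. The following LaTeX block is a sketch of that step, using only relation (2.27), the linearity of the expectation, and the fact that $-q_\alpha > 0$ when $\alpha < 0.5$:

```latex
\begin{aligned}
VaR_\alpha(X+Y) &= -E[X] - E[Y] + (-q_\alpha)\,\sigma(X+Y) \\
                &\le -E[X] - E[Y] + (-q_\alpha)\,\bigl(\sigma(X) + \sigma(Y)\bigr) \\
                &= VaR_\alpha(X) + VaR_\alpha(Y).
\end{aligned}
```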


2.2.1.3 Sensitivity of the VaR 


The sensitivity of the VaR introduced previously has been examined in
Gourieroux et al. [261], under the following assumptions: the increment of
asset prices $Y_{t+1} = (S_{1,t+1} - S_{1,t}, \ldots, S_{N,t+1} - S_{N,t})$ has a continuous conditional
distribution with positive density and finite second-order moments.


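PROPOSITION 2.8

Consider the portfolio with value V(a) given by $V_t(a_t) = a_t' S_t$. Under the
previous assumptions, $VaR_{t,\alpha}(a_t)$ has a first-order derivative with respect to the
allocation a given by:


$$\frac{\partial VaR_{t,\alpha}(a_t)}{\partial a} = -E_t\left[ Y_{t+1} \mid a_t' Y_{t+1} = -VaR_{t,\alpha}(a_t) \right].$$ (2.28)


Denote by $V_t$ the conditional variance and by $g_{a,t}$ the conditional pdf of
$a_t' Y_{t+1}$. The second-order derivative of VaR is equal to:


$$\frac{\partial^2 VaR_{t,\alpha}(a_t)}{\partial a \, \partial a'} = \frac{\partial \log g_{a,t}}{\partial z}\left( -VaR_{t,\alpha}(a_t) \right) V_t\left[ Y_{t+1} \mid a_t' Y_{t+1} = -VaR_{t,\alpha}(a_t) \right] + \left\{ \frac{\partial}{\partial z} V_t\left[ Y_{t+1} \mid a_t' Y_{t+1} = -z \right] \right\}_{z = VaR_{t,\alpha}(a_t)}.$$ (2.29)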


PROOF (see [261] for more details)
1) The VaR is defined from the relation:


$$P_{S_t,t}\left[ A_{i,t+1} + a_{i,t} B_{i,t+1} > VaR_{t,\alpha}(a_t) \right] = \alpha,$$


where $A_{i,t+1} = -\sum_{j \ne i} a_{j,t} Y_{j,t+1}$ and $B_{i,t+1} = -Y_{i,t+1}$.
Besides, the following property holds: if (A, B) is a bivariate continuous
vector, then the quantile $q(b, \alpha)$ defined by $P[A + bB > q(b, \alpha)] = \alpha$ has a
first-order derivative with respect to b given by:


$$\frac{\partial q}{\partial b}(b, \alpha) = E\left[ B \mid A + bB = q(b, \alpha) \right].$$


Using this result, the first-order derivative of the VaR is deduced.

2) The second-order derivative is determined from a first-order expansion
of the first-order derivative around a given allocation $a_0$ (see [261], Appendix
B).


REMARK 2.4 For the Gaussian case, the second-order derivative of the
VaR is equal to


$$\frac{\partial^2 VaR_{t,\alpha}(a_t)}{\partial a \, \partial a'} = \frac{q_{1-\alpha}}{(a_t' \Sigma_t a_t)^{1/2}} \, V_t\left[ Y_{t+1} \mid a_t' Y_{t+1} = -VaR_{t,\alpha}(a_t) \right],$$


where $q_{1-\alpha}$ is the quantile of level $(1 - \alpha)$ of the standard normal
distribution. Thus, as soon as $\alpha < 0.5$, the second-order derivative is positive, which
implies the convexity of the VaR. This result can be extended, for example,
to a Gaussian model with unobserved heterogeneity for which the probability
distribution is a mixture of Gaussian distributions (see [261]).


2.2.1.4 VaR estimation 


VaR estimation is obviously based on quantile estimation and tail analysis. 
First, parametric methods have been developed, generally using a Gaussian 
assumption on the joint distribution of asset returns (see e.g., JP Morgan 
Riskmetrics). Second, to better take account of fat tails, non-parametric 
approaches have been introduced to determine the historical VaR which is an 
empirical quantile (see e.g., [216], [283], [308]...). Semiparametric methods,
based in particular on extreme value approximation for the tails, have been
proposed by Bassi et al. [49] and McNeil ([380], [381]).

When observations are independent and stationary (iid), non-parametric
methods based on kernel estimators can be introduced as in [261]. Consider
the sequence of returns $(Z_t)_t$ defined by: for any asset i,


$$Z_{i,t+1} = (S_{i,t+1} - S_{i,t}) / S_{i,t}.$$ (2.30)


Let $\theta$ be the vector of allocations measured in values instead of shares. Then,
the VaR is defined by:


$$P_{S_t,t}\left[ -\theta_t' Z_{t+1} > VaR_{t,\alpha}(a_t) \right] = \alpha.$$ (2.31)
From the iid assumption, this is equivalent to:
$$P\left[ -\theta_t' Z_{t+1} > VaR_{t,\alpha}(a_t) \right] = \alpha.$$ (2.32)


Therefore, it can be consistently estimated from T observations by using a
Gaussian kernel. The estimated VaR, denoted by $\widehat{VaR}$, is the solution of:


$$\frac{1}{T} \sum_{t=1}^{T} N\left( \frac{-\theta' Z_t - \widehat{VaR}}{h} \right) = \alpha,$$ (2.33)




where N is the cdf of the standard normal distribution, and h is the selected
bandwidth.

Equation (2.33) is solved numerically by a Gauss-Newton recursive
algorithm. Let $var^{(p)}$ be the value of the approximation at step p, then:


$$var^{(p+1)} = var^{(p)} + \frac{\frac{1}{T} \sum_{t=1}^{T} N\left( \frac{-\theta' Z_t - var^{(p)}}{h} \right) - \alpha}{\frac{1}{Th} \sum_{t=1}^{T} f\left( \frac{-\theta' Z_t - var^{(p)}}{h} \right)},$$


where f is the pdf of the standard normal distribution.

The starting value $var^{(0)}$ can be chosen equal to the VaR calculated under
a Gaussian assumption or to the historical VaR. As mentioned in [261], this
method is quite appealing and can be applied for large portfolios.
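The recursion can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation of [261]: the smoothed equation solved is $\frac{1}{T}\sum_t N((-r_t - v)/h) = \alpha$, where $r_t$ stands for the portfolio return $\theta' Z_t$; the sample is simulated, and the bandwidth rule ($1.06 \, s \, T^{-1/5}$) and the function name `kernel_var` are choices made here for the example only.

```python
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()


def kernel_var(returns, alpha, h, start, tol=1e-12, max_iter=100):
    """Solve (1/T) sum N((-r_t - v)/h) = alpha for v by Newton-type steps."""
    T = len(returns)
    v = start
    for _ in range(max_iter):
        z = [(-r - v) / h for r in returns]
        excess = sum(STD_NORMAL.cdf(x) for x in z) / T - alpha
        density = sum(STD_NORMAL.pdf(x) for x in z) / (T * h)
        step = excess / density  # increment of the recursion
        v += step
        if abs(step) < tol:
            break
    return v


# Simulated portfolio returns (hypothetical data, fixed seed).
rng = random.Random(0)
returns = [rng.gauss(0.0005, 0.01) for _ in range(500)]
alpha = 0.05
h = 1.06 * stdev(returns) * len(returns) ** (-0.2)  # rule-of-thumb bandwidth (assumption)
start = -mean(returns) + stdev(returns) * STD_NORMAL.inv_cdf(1 - alpha)  # Gaussian VaR
estimate = kernel_var(returns, alpha, h, start)
```

Starting from the Gaussian VaR keeps the first iterate close to the solution, which is where a Newton-type step behaves well; a start far in the tail, where the smoothed density is nearly zero, could produce very large steps.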




2.2.2 CVaR 


As can easily be seen, VaR is a risk measure that only takes account of the
probability of losses, and not of their sizes. Moreover, VaR is usually based
on the assumption of normal asset returns and has to be carefully evaluated
when there are extreme price fluctuations. Furthermore, VaR may not be
convex for some probability distributions. Due to these deficiencies, other
risk measures have been proposed. Among them is the Expected Shortfall (ES)
as defined in Acerbi et al. [3], also called Conditional Value-at-Risk (CVaR)
in [427] or TailVaR in [31]. Note that in Acerbi and Tasche [5], several risk
measures related to ES are considered and the coherence of ES is proved.


2.2.2.1 Definition of the ES 
The ES can be expressed as follows (see [2]): 


DEFINITION 2.8 Let $F_X^{\leftarrow}$ be the generalized inverse of the cdf $F_X$ of X
defined by:

$$F_X^{\leftarrow}(p) = \sup \{x \mid F_X(x) < p\}.$$


Then, the ES is defined as the average in probability of all possible outcomes
of X in the probability range $0 \le p \le \alpha$:


$$ES_\alpha(X) = -\frac{1}{\alpha} \int_0^{\alpha} F_X^{\leftarrow}(p) \, dp.$$ (2.34)


Then, for continuous cdf, the ES is given by:


$$ES_\alpha(X) = -\left( E[X \mid X \le q_\alpha(X)] \right),$$ (2.35)


where $q_\alpha(X)$ is the quantile of X at the level $\alpha$.
REMARK 2.5 Figure (2.2) illustrates how two probability distributions
can have the same VaR, whereas one is thin-tailed (the Gaussian case) and 
the other is fat-tailed (the Pareto-Lévy case). In this example, the VaR cor- 
responding to a 99% probability of overshoot is equal to 2.33% of the value 
of the position. Nevertheless, the comparison of the expected shortfalls shows 
the different risk levels. 


As mentioned in Longin [362], for the standard Gaussian distribution, the
ES is approximately equal to $VaR + \frac{1}{VaR}$ (here equal to 2.66%). Thus, the
higher the VaR, the nearer the possible losses beyond the VaR. For a stable
Paretian distribution, with location and skewness parameters equal to 0, with
characteristic exponent $c > 1$, the ES is approximately equal to $VaR + \frac{VaR}{c-1}$
(here equal to 3.22%). Therefore, the higher the VaR, the wider the spread
of the losses beyond the VaR.





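REMARK 2.6 The expected shortfall ES is also equal to:


$$ES_\alpha(X) = -\frac{1}{\alpha} \left( E\left[ X 1_{\{X \le q_\alpha(X)\}} \right] - q_\alpha(X) \left( P[X \le q_\alpha(X)] - \alpha \right) \right).$$ (2.36)


Thus, when $F_X(x)$ has a jump at the level $\alpha$, the term $q_\alpha(X)(P[X \le q_\alpha(X)] - \alpha)$
must be subtracted from the expected value $E[X 1_{\{X \le q_\alpha(X)\}}]$.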














REMARK 2.7 When there are discontinuities in the loss distribution,
the $\alpha$-tail conditional expectation $TCE_\alpha$, defined by:


$$TCE_\alpha(X) = -\left( E[X \mid X \le q_\alpha(X)] \right),$$ (2.37)


is not always equal to $ES_\alpha(X)$. In that case, it may not be coherent (not
subadditive) as shown in [2].


The evaluation of marginal impacts of positions on risk measures and
regulatory capital is a key point for risk management analysis (see e.g.,
Jorion [308]).


When sensitivities are known, the global portfolio risk can be decomposed 
component by component. Then, the largest risk contributions can be iden- 
tified. Additionally, it is no longer necessary to recompute the risk measure 
each time the portfolio composition is slightly modified. 


The sensitivity analysis of expected shortfall and its (non-parametric) es- 
timation has been examined in Scaillet [450]. Estimators for both expected 
shortfall and its first-order derivative are provided and their consistencies 
proved. 
2.2.2.2 Sensitivity of the ES


PROPOSITION 2.9
Consider the portfolio with value V(a) given by $V_t(a_t) = a_t' S_t$. The expected
shortfall $ES_{t,\alpha}(a_t)$ has a first-order derivative with respect to the allocation a
given by:

$$\frac{\partial ES_{t,\alpha}(a_t)}{\partial a} = -E_t\left[ Y_{t+1} \mid a_t' Y_{t+1} < -VaR_{t,\alpha}(a_t) \right].$$ (2.38)
For the Gaussian case, the expected shortfall is given by:
$$ES_{t,\alpha}(a_t) = -a_t' \mu_t + (a_t' \Sigma_t a_t)^{1/2} \, \frac{f(q_{1-\alpha})}{\alpha},$$ (2.39)


where f is the pdf of the standard Gaussian law. Thus, its first-order
derivative is equal to:


$$\frac{\partial ES_{t,\alpha}(a_t)}{\partial a} = -\mu_t + \frac{\Sigma_t a_t}{(a_t' \Sigma_t a_t)^{1/2}} \, \frac{f(q_{1-\alpha})}{\alpha}.$$ (2.40)


PROOF (See [450]). The proof is based on the following property:
consider the loss function $L_\varepsilon = X - \varepsilon Y$ where $\varepsilon$ is a positive real number and (X,Y)
is a random vector admitting a continuous conditional density with respect to
the Lebesgue measure. Then the first derivative of the expected shortfall is equal
to:


$$\partial ES(L_\varepsilon) / \partial \varepsilon = -E\left[ Y \mid L_\varepsilon > VaR(L_\varepsilon) \right].$$ (2.41)











Note that for the Gaussian case, the first-order derivative is an affine
function of the expected shortfall itself since we have:


$$\frac{\partial ES_{t,\alpha}(a_t)}{\partial a} = -\mu_t + \frac{\Sigma_t a_t}{a_t' \Sigma_t a_t} \left( ES_{t,\alpha}(a_t) + a_t' \mu_t \right).$$ (2.42)
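The Gaussian formulas lend themselves to a quick numerical check. The sketch below assumes the Gaussian expected shortfall $-a'\mu + (a'\Sigma a)^{1/2} f(q_{1-\alpha})/\alpha$ and its gradient obtained by differentiation; the function names and the two-asset inputs are hypothetical. It also evaluates the affine form of the gradient in terms of the ES itself, so that the two expressions can be compared.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def _tail_factor(alpha):
    """f(q_{1-alpha}) / alpha for the standard normal law."""
    return STD.pdf(STD.inv_cdf(1.0 - alpha)) / alpha


def gaussian_es(a, mu, sigma, alpha):
    n = len(a)
    mean = sum(a[i] * mu[i] for i in range(n))
    sd = sqrt(sum(a[i] * sigma[i][j] * a[j] for i in range(n) for j in range(n)))
    return -mean + sd * _tail_factor(alpha)


def gaussian_es_gradient(a, mu, sigma, alpha):
    n = len(a)
    sigma_a = [sum(sigma[i][j] * a[j] for j in range(n)) for i in range(n)]
    sd = sqrt(sum(a[i] * sigma_a[i] for i in range(n)))
    return [-mu[i] + sigma_a[i] / sd * _tail_factor(alpha) for i in range(n)]


# Hypothetical inputs (illustrative numbers only).
a, mu, alpha = [1.0, 2.0], [0.5, 0.2], 0.01
sigma = [[4.0, 1.0], [1.0, 9.0]]
es = gaussian_es(a, mu, sigma, alpha)
grad = gaussian_es_gradient(a, mu, sigma, alpha)

# Affine form: -mu + Sigma a / (a' Sigma a) * (ES + a' mu).
quad = sum(a[i] * sigma[i][j] * a[j] for i in range(2) for j in range(2))
a_mu = sum(x * m for x, m in zip(a, mu))
grad_affine = [
    -mu[i] + sum(sigma[i][j] * a[j] for j in range(2)) / quad * (es + a_mu)
    for i in range(2)
]

# Centered single position with a 1% standard deviation, level 1%.
es_single = gaussian_es([1.0], [0.0], [[0.01 ** 2]], 0.01)
```

For the centered position with a 1% standard deviation, `es_single` is about 2.66%, the value quoted for the Gaussian case in the comparison with the stable Paretian distribution.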


2.2.2.3 Estimation of the ES 


This analysis has been mainly based on estimation of the mean excess func- 
tion in extreme value theory (see Embrechts et al. [201] or McNeil [380]). 
Thus, data have been assumed to be iid and usually univariate. In Scaillet 
[450], the estimation method is based on a kernel approach. This allows us to 
take account of more general dependencies, such as strong mixing (as defined 
for example in [169]). 

Assume that the data have been generated by a sequence of random
variables $(Y_t)_t$ which models the “risks” along the time period. The process Y is
supposed to be strictly stationary. It may be, for example, a VARMA or a
multivariate GARCH process.
Consider the following kernel estimator:


$$[(Y_t, a'Y_t); \xi] = (Th)^{-1} \sum_{t=1}^{T} Y_t \, K\left( (\xi - a'Y_t)/h \right).$$ (2.43)
Introduce also:
$$I(\xi) = \int_{-\infty}^{\xi} [(Y_t, a'Y_t); u] \, du.$$ (2.44)


The bandwidth h is assumed to be a function of the number of observations
T which converges to 0 as T goes to infinity. The kernel estimator $\hat{q}(a, \alpha)$ of
the quantile $-VaR_\alpha(a)$ is determined from the relation:


$$\int_{-\infty}^{\hat{q}(a,\alpha)} [(1, a'Y_t); u] \, du = \alpha.$$ (2.45)


Then, the ratio $I[\hat{q}(a, \alpha)]/\alpha$ is an estimator of the conditional expectation
$E[Y_t \mid a'Y_t < -VaR_\alpha(a)]$. Therefore an estimator of the ES is given by:


$$\hat{m} = -a' \, I[\hat{q}(a, \alpha)]/\alpha.$$ (2.46)
Similarly, an estimator of the first-order derivative of the ES is given by:
$$\hat{m}_1 = -I[\hat{q}(a, \alpha)]/\alpha.$$ (2.47)


Note that $\hat{m}$ is simply equal to $\hat{m}_1$ multiplied by the allocation a. Under mild
assumptions, asymptotic normality of the estimators is proved in [450].
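A compact sketch of this kernel approach follows. It is an illustration, not the estimator code of [450], and it makes several assumptions: the kernel K is Gaussian (so the integrated kernel is the standard normal cdf), the quantile equation is solved by bisection, the gradient estimate carries a minus sign so that the ES estimate is the allocation times the gradient estimate, and the data are simulated. The names `kernel_es`, `es_hat`, `grad_hat` and `q_hat` are hypothetical.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # integrated Gaussian kernel


def kernel_es(observations, a, alpha, h):
    """Return (es_hat, grad_hat, q_hat) from observations of the vector Y_t."""
    T = len(observations)
    pnl = [sum(ai * yi for ai, yi in zip(a, y)) for y in observations]  # a'Y_t

    def smoothed_cdf(q):
        return sum(PHI((q - x) / h) for x in pnl) / T

    # Bisection for the smoothed alpha-quantile q_hat of a'Y.
    lo, hi = min(pnl) - 10 * h, max(pnl) + 10 * h
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if smoothed_cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    q_hat = 0.5 * (lo + hi)

    # I(q_hat) = (1/T) sum Y_t * Phi((q_hat - a'Y_t)/h), component by component.
    weights = [PHI((q_hat - x) / h) for x in pnl]
    grad_hat = [
        -sum(w * y[i] for w, y in zip(weights, observations)) / (T * alpha)
        for i in range(len(a))
    ]
    es_hat = sum(ai * gi for ai, gi in zip(a, grad_hat))
    return es_hat, grad_hat, q_hat


# Simulated two-asset increments (hypothetical data, fixed seed).
rng = random.Random(1)
observations = [[rng.gauss(0.0, 0.01), rng.gauss(0.0, 0.02)] for _ in range(400)]
a = [0.6, 0.4]
es_hat, grad_hat, q_hat = kernel_es(observations, a, alpha=0.05, h=0.003)
```

Here `-q_hat` plays the role of the estimated VaR, and `es_hat` exceeds it on this sample, as expected of an average over the tail beyond the quantile.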


2.2.2.4 Numerical computation of the ES 


Often, the dependence between asset prices and their particular payoffs
implies that no tractable analytical expression of the portfolio distribution is
available. In such a case, Monte Carlo simulations can be used. Consider N
independent simulations $X_i$ of the same position value X. They allow the
approximation of the cdf of X by the empirical cdf $F_X^N$ defined by:


$$F_X^N(x) = \frac{1}{N} \sum_{i=1}^{N} 1_{\{X_i \le x\}}.$$ (2.48)


Let us introduce the ordered statistics $(X_{i:N})_i$, i.e., the values $X_{i:N}$ are
equal to the values $X_i$ sorted in increasing order:


$$X_{1:N} \le \ldots \le X_{N:N}.$$ (2.49)
Denote by $ES_\alpha^N(X)$ the $\alpha$-expected shortfall of the simulated distribution.
For any real number x, denote by $\lfloor x \rfloor$ the integer part of x. Then:


$$ES_\alpha^N(X) = -\frac{1}{\lfloor N\alpha \rfloor} \sum_{i=1}^{\lfloor N\alpha \rfloor} X_{i:N}.$$ (2.50)


REMARK 2.8 Note that we have also:


$$ES_\alpha^N(X) = -\frac{1}{N\alpha} \left( \sum_{i=1}^{\lfloor N\alpha \rfloor} X_{i:N} + (N\alpha - \lfloor N\alpha \rfloor) X_{\lfloor N\alpha \rfloor + 1:N} \right).$$ (2.51)


For example, consider a sample of about 250 daily outcomes of a financial
return, observed over one year. Then, with $\alpha = 5\%$, $N\alpha = 12.5$ and the
$\alpha$-expected shortfall of the estimated distribution is equal to


$$ES_{5\%}^{250}(X) = -\frac{1}{12.5} \left( \sum_{i=1}^{12} X_{i:250} + (0.5) X_{13:250} \right).$$


Thus, the last term $(0.5) X_{13:250}$ is not negligible.
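The sample formula with the fractional term can be sketched as follows. The function name `empirical_es` and the toy sample of 250 integers are illustrative assumptions; the computation averages the $\lfloor N\alpha \rfloor$ smallest outcomes and adds the next order statistic with weight $N\alpha - \lfloor N\alpha \rfloor$, then divides by $N\alpha$.

```python
from math import floor


def empirical_es(sample, alpha):
    """Expected shortfall of the empirical distribution, with the fractional term."""
    xs = sorted(sample)            # order statistics X_{1:N} <= ... <= X_{N:N}
    n_alpha = len(xs) * alpha
    k = floor(n_alpha)
    tail_sum = sum(xs[:k])         # the k smallest outcomes
    if k < len(xs):
        tail_sum += (n_alpha - k) * xs[k]  # fractional weight on X_{k+1:N}
    return -tail_sum / n_alpha


# Toy sample of 250 outcomes (hypothetical): the integers -125, ..., 124.
sample = list(range(-125, 125))
es_250 = empirical_es(sample, 0.05)
```

With 250 observations and a 5% level, the 13th smallest outcome enters with weight 0.5, exactly as in the numerical example above.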


Using results about convergence of ordered statistics, the following result is 
proved: 


PROPOSITION 2.10
The $\alpha$-expected shortfall of the simulated distribution $ES_\alpha^N(X)$ converges
almost surely to the $\alpha$-expected shortfall $ES_\alpha(X)$ of the true distribution. Note
that when the lower and upper $\alpha$-Value-at-Risk are equal, the convergence
of the $\alpha$-VaR holds: $VaR_\alpha^N(X) = -X_{\lfloor N\alpha \rfloor + 1:N}$ converges to $VaR_\alpha(X) =
VaR^\alpha(X)$.

However, if $VaR_\alpha(X) \ne VaR^\alpha(X)$, then $VaR_\alpha^N(X)$ does not converge as
N converges to infinity, but flips between $VaR_\alpha(X)$ and $VaR^\alpha(X)$.


In practice, it is necessary to also use explicit estimators of the first or- 
der derivatives of the risk measure. These derivatives are also of particular 
relevance in the portfolio selection problem. They help to characterize and 
evaluate efficient portfolio allocations when VaR and ES are substituted for 
variance as measures of risk. In fact, numerical constrained optimization al- 
gorithms for computations of optimal allocations usually require consistent 
estimates of first order derivatives in order to converge properly. Therefore, 
it is necessary to search for risk measures which are not only “rational,” but 
also easily estimated and computed. This is one of the reasons to consider 
the spectral risk measures. 
2.2.3 Spectral measures of risk


The expected shortfall is a “natural” coherent extension of the Value-at-
Risk. Both are based on quantiles of positions X. As seen in relation (2.34),
the ES is an average of the outcomes of X with respect to a particular
probability which is uniform on the interval $[0, \alpha]$, i.e., it assigns equal weights
$dp/\alpha$ to the worst $100\alpha\%$ and zero to the others.


2.2.3.1 Definition of spectral risk measures 


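More general averages of the following type can be introduced as in Acerbi [1]:


$$M_\phi(X) = -\int_0^1 \phi(p) \, F_X^{\leftarrow}(p) \, dp,$$ (2.52)


where $\phi$ is a weighting function defined on the set of quantiles. In [1], the
function $\phi$ is called risk spectrum or risk aversion function, associated to the
measure $M_\phi$. For the ES, the function $\phi$ is given by:


$$\phi_{ES_\alpha}(p) = \frac{1}{\alpha} 1_{\{p \le \alpha\}} = \begin{cases} \frac{1}{\alpha} & \text{if } p \le \alpha, \\ 0 & \text{else.} \end{cases}$$ (2.53)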

As any convex combination of coherent measures of risk is clearly coherent,
we can search for the convex hull of a given set of such measures, among them
the family of ES, indexed by the level $\alpha \in [0,1]$. In Acerbi [1], any measure
in this convex hull is called spectral risk measure and the following result is
proved:


PROPOSITION 2.11
Any spectral risk measure is of the type:


$$M_{\phi,c}(X) = c \, ES_0(X) - (1 - c) \int_0^1 \phi(p) \, F_X^{\leftarrow}(p) \, dp,$$ (2.54)


with $c \in [0,1]$ and $\phi : [0,1] \to \mathbb{R}$ satisfying the following conditions:
1) Positivity: $\phi(p) \ge 0$ for all $p \in [0,1]$.
2) Normalization: $\int_0^1 \phi(p) \, dp = 1$.
3) Monotonicity: $\phi(p_1) \ge \phi(p_2)$ for all $0 \le p_1 \le p_2 \le 1$.
The measure $M_{\phi,c}$ is coherent if and only if these assumptions are satisfied.
Exponential risk-aversion functions $\phi$ can be used:


$$\phi(p) = \frac{e^{-p/\gamma}}{\gamma \left( 1 - e^{-1/\gamma} \right)}, \quad \gamma \in (0, +\infty).$$


The smaller the parameter $\gamma$, the steeper the risk-aversion function $\phi$, which
allows us to better take account of the left tail of the distribution.


2.2.3.2 Characterization of spectral risk measures 


The set of spectral risk measures does not contain all coherent risk mea- 
sures. To characterize spectral risk measures, comonotonic additivity, and 
law-invariance have to be introduced as in [342] and [489]. 


DEFINITION 2.9 1) (Comonotonicity) Two random variables X and Y
are said to be comonotonic if they are non-decreasing functions of the same
random variable Z:


$X = f(Z)$ and $Y = g(Z)$, with f and g : $\mathbb{R} \to \mathbb{R}$ non-decreasing.


2) (Comonotonic additivity) A measure of risk $\rho$ is said to be comonotonic
additive if, for any comonotonic random variables X and Y:


$$\rho(X + Y) = \rho(X) + \rho(Y).$$ (2.55)


REMARK 2.9 The comonotonicity is a very strong dependence property 

between two random variables. In particular, if financial asset returns are 
comonotonic, they cannot hedge each other and diversification is inefficient. 
Thus, in that case, property (2.55) is rather intuitive. 


The law-invariance property is also “natural”: 


DEFINITION 2.10 A measure of risk $\rho$ is said to be law-invariant if,
for any random variables X and Y having the same probability distributions:


$$\rho(X) = \rho(Y).$$ (2.56)
This means that the measure of risk $\rho$ is a functional defined on the set of cdf:


$$\rho(X) = \rho(F_X(\cdot)).$$


REMARK 2.10 As mentioned in [2], a law-invariant measure of risk gives
the same value to two empirically indistinguishable random variables. Thus,
a measure $\rho$ which is not law-invariant cannot be estimated from empirical
data.


PROPOSITION 2.12 
(see [2]) The class of spectral risk measures is the set of all coherent risk 
measures which are comonotonic additive and law-invariant. 
This result is in favor of the spectral risk measures, since both conditions
are quite rational. Basic examples of coherent measures of risk which do not 
satisfy one of these conditions are: 


• The risk measure, introduced by Fisher [228], defined by:


$$\rho_{p,a}(X) = -E[X] + a \, \sigma_p^-(X),$$ (2.57)


where $0 \le a \le 1$, and where the one-side pth moment is given by:


$$\sigma_p^-(X) = \sqrt[p]{E\left[ \max(0, E[X] - X)^p \right]}.$$ (2.58)


For $a \ne 0$, this measure is not comonotonic additive. Note that, for
$a = 1$ and $p = 2$, the minimization of this measure is equivalent to the
mean-semivariance analysis.


• The worst conditional expectation (WCE) defined in Artzner et al. [31]:


$$WCE_\alpha(X) = -\inf \left\{ E[X \mid A] : A \in \mathcal{A}, \; P[A] > \alpha \right\},$$ (2.59)


where $\mathcal{A}$ is a given $\sigma$-algebra. This measure takes the infimum of
conditional risk measures only on events A with probability larger than the
level $\alpha$. Thus, generally it is not a law-invariant risk measure, except
when the probability space is non-atomic (in that case, it is equal to the
expected shortfall ES).


As seen in Chapter 1, the notion of stochastic dominance (in particular of 
the first-order) allows the comparison of two random financial positions by 
ordering their probability distributions. Thus, it is interesting to search for 
risk measures which are compatible with stochastic dominance. 


DEFINITION 2.11 A map $\rho : X \to \rho(X) \in \mathbb{R}$ is said to be monotonic
with respect to first-order stochastic dominance if it satisfies:


$$\rho(X) \ge \rho(Y) \quad \text{if } Y \succeq_1 X.$$


Note that if $Y \succeq_1 X$, then X always has a left tail higher than Y: $VaR_\alpha(X) \ge
VaR_\alpha(Y)$ at any level $\alpha$. Therefore, the previous monotonicity property is
rather desirable.

Acerbi and Tasche prove that spectral risk measures are characterized by 
such property: 


PROPOSITION 2.13 

The class of spectral risk measures is the set of all coherent risk measures which 
are comonotonic additive and monotone with respect to first-order stochastic 
dominance. 
2.2.3.3 Estimation of spectral risk measures


The ordered statistics $(X_{i:N})_i$ allow definition of a consistent estimator
$M_\phi^N(X)$ of the spectral risk measure $M_\phi(X)$, as follows:


$$M_\phi^N(X) = -\sum_{i=1}^{N} X_{i:N} \, \phi_i, \quad \text{with } \phi_i = \int_{(i-1)/N}^{i/N} \phi(p) \, dp.$$ (2.60)


Then the sequence $(M_\phi^N(X))_N$ converges almost surely to the spectral
risk measure $M_\phi(X)$.
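As an illustration of this estimator, the sketch below computes the weights $\phi_i$ in closed form for two spectra: the exponential risk-aversion function and the ES spectrum. It assumes $c = 0$ (no $ES_0$ component); the function names and the toy sample are hypothetical.

```python
from math import exp


def exponential_weights(n, gamma):
    """phi_i = integral of exp(-p/gamma) / (gamma (1 - exp(-1/gamma))) over ((i-1)/n, i/n]."""
    norm = 1.0 - exp(-1.0 / gamma)
    return [
        (exp(-(i - 1) / (n * gamma)) - exp(-i / (n * gamma))) / norm
        for i in range(1, n + 1)
    ]


def es_weights(n, alpha):
    """phi_i for the ES spectrum (1/alpha) 1_{p <= alpha}."""
    return [
        max(0.0, min(i / n, alpha) - (i - 1) / n) / alpha
        for i in range(1, n + 1)
    ]


def spectral_risk(sample, weights):
    """Estimator -sum_i X_{i:N} phi_i built on the order statistics."""
    xs = sorted(sample)
    return -sum(x * w for x, w in zip(xs, weights))


# Toy sample of 250 outcomes (hypothetical): the integers -125, ..., 124.
sample = list(range(-125, 125))
w_exp = exponential_weights(len(sample), gamma=0.1)
risk_exp = spectral_risk(sample, w_exp)
risk_es = spectral_risk(sample, es_weights(len(sample), 0.05))
```

With the ES spectrum, the weights reduce to $1/(N\alpha)$ on the $\lfloor N\alpha \rfloor$ worst outcomes plus a fractional weight on the next one, so `risk_es` coincides with the expected shortfall of the empirical distribution.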


2.3 Further reading 


Szegö ([484] and [485]) provides a general overview of risk measures and 
their applications to regulation, capital allocation, and portfolio optimization. 
Time-consistency for preferences has been introduced in Koopmans [332]. For 
further contributions, see Epstein and Schneider [211], Epstein and Zin [212], 
and Wang [502]. Examples and characterizations of time-consistent coher- 
ent risk measures are also given in Weber [503] and Cheridito et al. ([117], 
[118]), where coherent and convex monetary risk measures are defined on the 
space of all càdlàg processes that are adapted to a given filtration, bounded or
non-bounded. In Barrieu and El Karoui [43], the inf-convolution of risk measures
is introduced and applied to examine optimal risk transfer. When
information is partial or asymmetrical with no a priori given probability, robust
representation of convex conditional risk measures can also be deduced as in 
Bion-Nadal [71] (see also Detlefsen and Scandolo [165] in a dynamic frame- 
work). Law-invariant risk measures are examined in particular in Kusuoka 
[342]. Vector-valued coherent risk measures are introduced in Jouini et al. 
[311] to take account of difficulty in aggregating portfolio positions due to 
liquidity problems or transaction costs. Another approach to generate risk 
measures is proposed in Goovaerts et al. [259]. This is based on insurance
premium principles and uses the Markov inequality for tail probabilities. It 
does not always lead to coherent risk measures. This point is further analyzed 
by Denuit et al. [159], who generate a large class of risk measures from the 
actuarial equivalent utility pricing principle. 


As examined in Wilson [504], three main methods can be used to determine 
the VaR: 


• The Variance/Covariance method, which assumes a normal probability
distribution of returns (thus with deterministic volatilities). This
assumption makes the method easy to interpret, to compute, and to
implement. However, it does not capture the fat tails
of returns. Besides, the problem of non-linearity of derivatives payoffs
has to be (partially) solved by a Delta/Gamma approximation.


• The Historic Simulation, based on past data which are generated by
“true” distributions. Fat tails and non-linearity can be taken into
account. Its interpretation is straightforward, but the use of large numbers
of data can be time-consuming. Moreover, the implicit assumption that
past data are “good” predictors of future returns is not always valid.


• The Monte Carlo Simulation, whose purpose is to generate values
of the position from specific assumptions on returns and investment
strategies. This method can be applied for quite general distributions
and for different time periods. It can take non-linearity into account.
However, its interpretation and computation are more involved.


Duffie and Pan [176] also provide a broad overview of VaR, in particular in
the presence of market risk: asset returns have Markovian stochastic
volatilities (GARCH type, with possible regime-switching, etc.), and/or are modelled
by jump diffusions. Factor models are also considered when the portfolio of
positions has a market value which is sensitive to variations of risk factors, such
as major equity indices and treasury rates. A Delta/Gamma approach is also
proposed to simplify the calculation methods, in particular when using Monte
Carlo simulations for large portfolios.


The first derivative of the VaR and the ES with respect to portfolio
allocation can also be derived when netting between positions exists, as in credit
risk management (see Fermanian and Scaillet [221]).


Conditional parametric estimations of VaR and CVaR based on GARCH
modelling of financial time series are examined in Alexander and Tasche [16]
and in McNeil et al. ([380], [381]). Engle and Manganelli [207] propose a
quantile estimation that does not assume normality or iid returns. They introduce
the so-called conditional autoregressive value at risk (CAViaR) model, where the
evolution of the quantile over time is modeled by an autoregressive process.


For other detailed analyses of Value-at-Risk and its application to risk man- 
agement, see Jorion [309], Dowd [170], and Stulz [482]. 


Part II 


Standard portfolio 
optimization 


“The rule that the investor does (or should) maximize discounted expected, 
or anticipated, returns...is rejected both as a hypothesis to explain, and as 
a maximum to guide investment behavior. We next consider the rule that 
the investor does (or should) consider expected return a desirable thing and 
variance of return an undesirable thing. This rule has many sound points, both 
as a maxim for, and hypothesis about, investment behavior...Diversification 
is both observable and sensible, a rule of behavior which does not imply the 
superiority of diversification must be rejected both as a hypothesis and as a 
maxim”. 


Harry Markowitz, “Portfolio Selection”, Journal of Finance, (1952). 
Most of the time, an investor (private or fund manager) has to choose a
combination of securities in an uncertain framework. Two main objectives
can be pursued:


• Passive portfolio management. The purpose is to replicate a given
financial index as closely as possible. The implicit assumption is that
financial markets are efficient; no financial strategy can regularly outperform
their performances. Clearly, for a given portfolio horizon, a private
investor will choose a passive management style if the market is presumed
to be bullish, with relatively high probability.


• Active portfolio management. If the financial market is not efficient,
better asset allocations may exist. In this case, financial assets must
first be selected; second, their weighting must be optimized. A fund
manager may also choose between two strategic approaches: the bottom-up
process, where individual securities are selected on the basis of their individual
performances, or the top-down analysis where, for example, macroeconomic
and financial forecasts determine the global asset allocation among
international securities.


For the first step, asset selection, the investor needs financial data and
appropriate estimation methods. The selection of common stock is usually
also based on earnings forecasts and valuation processes, such as the discounted
cash flow models which allow the evaluation of, for instance, price-earnings
ratios.

For the second step, asset weighting, the investor can select a decision rule to 
optimally choose the percentage invested on each asset. This choice criterion 
can be rationalized by using such axiomatics as detailed in Part I of this book. 

However, specific constraints may be imposed on the portfolio allocation 
which induces more involved computational problems. 


This part provides an overview of static portfolio optimization and standard 
performance analysis: 


- First, the optimal weighting for active (but static) portfolio management, 
based on the seminal Markowitz model is discussed. Extensions of this theory 
are also presented. 

- Second, indexed funds (passive management) and benchmark optimization 
(mixture of active and passive asset allocation) are introduced. Most of the 
funds are based on this latter method. 

- Finally, standard performance measures to analyze and rank mutual funds 
are enumerated. 


Chapter 3 


Static optimization 


This chapter is mainly devoted to the mean-variance analysis introduced by 
Markowitz [373]: 

- The fundamental lesson of the Markowitz analysis is to show that investors 
must take care not only of the realized returns but also of the “risk” of their 
position, represented by the standard deviation of their portfolio return. 

- Markowitz proposes to measure the risk of return R by its standard
deviation $\sigma(R)$, and to determine the minimal $\sigma(R)$ for any fixed expected return
$E[R]$. The dual criterion can also be considered: maximize the expected return
for any fixed standard deviation.














The first part of the chapter details the determination and the analysis of 
the efficient frontier: 


1. Diversification property which allows for reduction of portfolio risk. 


2. Optimal weights computation with the main properties of two-fund sep- 
aration and efficient frontier. 


3. Additional constraints on strategies, taking into account specific bounds 
on portfolio weights. 


4. Parameter estimation problems.


The second part examines expected utility maximization and/or risk
measures minimization, such as safety criteria or CVaR minimization:


1. In particular, conditions ensuring that these additional criteria allow 
choosing one and only one efficient portfolio are examined. 


2. For expected utility maximization, results about the important property 
of two-fund separation are detailed. 


3. For VaR/CVaR minimization, convexity properties are detailed in order 
to use algorithms based on convex programming. 
3.1 Mean-variance analysis


Consider an investor with an initial amount $V_0$ to invest in given financial assets. The time horizon is fixed and the strategy is assumed to be static (buy-and-hold). The expectations of returns and their variance-covariance matrix are assumed to be known.


3.1.1 Diversification effect 


Example 3.1 

To illustrate the impact of diversification, consider the following case: let $S_1$ and $S_2$ be two financial assets, and let $R_1$ and $R_2$ be their respective returns. Denote by $E[R_1]$ and $E[R_2]$ the expectations of their returns, and by $\sigma_1^2$ and $\sigma_2^2$ their variances. Finally, their covariance is denoted by:

$$\sigma_{12} = \mathrm{Cov}(R_1, R_2) = \rho_{12}\,\sigma_1\sigma_2,$$

where $\rho_{12}$ ($-1 \le \rho_{12} \le +1$) is the correlation coefficient between the two assets $S_1$ and $S_2$. Consider a portfolio $P$ with a percentage of wealth $x$ invested in asset $S_1$ and $(1-x)$ invested in $S_2$. Its return $R_P$ is such that:

$$E[R_P] = x\,E[R_1] + (1-x)\,E[R_2],$$

$$\sigma_P^2 = x^2\sigma_1^2 + (1-x)^2\sigma_2^2 + 2x(1-x)\,\sigma_1\sigma_2\rho_{12}.$$






































The following figure indicates the set of all such combinations of the two assets, according to the value of the correlation coefficient $\rho_{12}$. The first axis corresponds to the values of $\sigma_P$, and the second to the values of $E[R_P]$.
Parameter values: $E[R_1] = 10\%$, $E[R_2] = 20\%$, $\sigma_1 = 0.12$, $\sigma_2 = 0.17$.






































[Figure: mean return (vertical axis) against standard deviation (horizontal axis), showing the assets $S_1$ and $S_2$]
FIGURE 3.1: Diversification effect
We note that:


1) If $\rho_{12} = 1$, both the expected return and the standard deviation of the portfolio $P$ are linear combinations of those of the two assets. Therefore, the set of all possible portfolios is the segment which lies between the two assets $S_1$ and $S_2$. Additionally, if $0 \le x \le 1$, no portfolio has a standard deviation lower than the minimal value $\min(\sigma_1, \sigma_2)$.


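2) If $-1 < \rho_{12} < +1$, we search for the minimal variance portfolio by solving the following equation:

$$\frac{1}{2}\frac{\partial \sigma^2(R_P)}{\partial x} = x\sigma_1^2 - (1-x)\sigma_2^2 + (1-2x)\,\sigma_1\sigma_2\rho_{12} = 0. \tag{3.1}$$

The optimal percentage is given by:

$$x^* = \frac{\sigma_2^2 - \sigma_1\sigma_2\rho_{12}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho_{12}}. \tag{3.2}$$

Its variance satisfies:

$$\sigma^2(R_{P^*}) = \frac{(1-\rho_{12}^2)\,\sigma_1^2\sigma_2^2}{\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho_{12}}. \tag{3.3}$$

Assume, for example, that the security $S_1$ is less risky than $S_2$ ($\sigma_1 < \sigma_2$). Then:

$$\sigma^2(R_{P^*}) - \sigma_1^2 = -\frac{\sigma_1^2\,(\sigma_1 - \rho_{12}\sigma_2)^2}{\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho_{12}}. \tag{3.4}$$

If $\rho_{12} \ne \sigma_1/\sigma_2$, from (3.4) we see that the difference $\sigma^2(R_{P^*}) - \sigma_1^2$ is always negative, whatever the value of the correlation coefficient. Thus, the minimal standard deviation is smaller than $\sigma_1$.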


If $\rho_{12} < \sigma_1/\sigma_2$, then $\sigma_1^2 - \sigma_1\sigma_2\rho_{12} > 0$ and $1 - x^* > 0$. Since $x^* > 0$ always holds when $\sigma_1 < \sigma_2$, both percentages are positive.

If $\rho_{12} > \sigma_1/\sigma_2$, then $\sigma_1^2 - \sigma_1\sigma_2\rho_{12} < 0$ and $1 - x^* < 0$: this represents a short position on the riskier asset $S_2$.

If $\rho_{12} = \sigma_1/\sigma_2$, then $x^* = 1$ and the minimal standard deviation is equal to $\sigma_1$. In that case, there is no diversification effect.

If $\rho_{12} = -1$, there exists a portfolio with no risk: $x^* = \sigma_2/(\sigma_1 + \sigma_2)$.

Diversification allows the reduction of the risk as soon as the correlation coefficient between the two assets is strictly smaller than 1, except when $\rho_{12} = \sigma_1/\sigma_2$.
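The two-asset formulas of Example 3.1 can be checked numerically. The following is a minimal sketch in plain Python, using the parameter values quoted for Figure 3.1; the function names and the set of correlation values are illustrative choices, not part of the original text.

```python
import math

def two_asset_portfolio(x, mu1, mu2, s1, s2, rho):
    """Mean and standard deviation of the portfolio with weight x on S1."""
    mean = x * mu1 + (1 - x) * mu2
    var = x**2 * s1**2 + (1 - x)**2 * s2**2 + 2 * x * (1 - x) * s1 * s2 * rho
    return mean, math.sqrt(max(var, 0.0))  # guard against tiny negative rounding

def min_variance_weight(s1, s2, rho):
    """Weight x* on S1 from relation (3.2); the denominator must be non-zero."""
    return (s2**2 - s1 * s2 * rho) / (s1**2 + s2**2 - 2 * s1 * s2 * rho)

# Parameter values of Example 3.1.
MU1, MU2, S1, S2 = 0.10, 0.20, 0.12, 0.17

results = {}
for rho in (-1.0, 0.0, 0.5, S1 / S2):
    x_star = min_variance_weight(S1, S2, rho)
    mean, std = two_asset_portfolio(x_star, MU1, MU2, S1, S2, rho)
    results[rho] = (x_star, mean, std)
    print(f"rho={rho:+.3f}  x*={x_star:.4f}  mean={mean:.4f}  std={std:.4f}")
```

For $\rho_{12} = -1$ the minimal standard deviation vanishes, while for $\rho_{12} = \sigma_1/\sigma_2$ the whole wealth goes to $S_1$ and no risk reduction is obtained.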
Example 3.2
Consider a more general case with $n$ risky securities (see Merton [387]).
Notations: for $i = 1, \ldots, n$,
$w = (w_1, \ldots, w_n)$ is the vector of portfolio weights.
$R = (R_1, \ldots, R_n)$ is the vector of asset returns.
$\overline{R} = (\overline{R}_1, \ldots, \overline{R}_n)$ is the vector of asset return expectations.
$e = (1, \ldots, 1)$ is the vector with all components equal to 1.
$V = [\sigma_{ij}]_{1 \le i,j \le n}$ is the $(n \times n)$ variance-covariance matrix of returns. The matrix $V$ is assumed to be invertible.


Denote by A’ the vector deduced from transposition of the vector A. For 
each given expected return, we have to determine the minimal variance port- 
folio. 


The expected return of any portfolio $P$ with weights $w$ is given by:

$$E[R_P] = \sum_{i=1}^{n} w_i\,E[R_i] = w'\overline{R}. \tag{3.5}$$


























The variance of the return of $P$ is equal to:

$$\sigma^2(R_P) = w'Vw = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} = 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} w_i w_j \sigma_{ij} + \sum_{i=1}^{n} w_i^2\sigma_i^2. \tag{3.6}$$


The previous relation shows the decomposition of the variance of the portfolio 
return into two components. This relation proves that the marginal contri- 
bution of a given asset to the risk of the whole portfolio is not reduced to its 
own risk (its variance), but also takes account of its potential correlations to 
other securities. This latter property induces the diversification effect. 

From relation (3.6), the partial derivative with respect to any weight $w_i$ is deduced:

$$\frac{\partial \sigma^2(R_P)}{\partial w_i} = 2\sum_{j=1}^{n} w_j\sigma_{ij}.$$

Denote by $\sigma_{iP}$ the covariance between asset $i$ and portfolio $P$. Then:

$$\sum_{j=1}^{n} w_j\sigma_{ij} = \sum_{j=1}^{n} w_j\,\mathrm{Cov}(R_i, R_j) = \mathrm{Cov}\Big(R_i, \sum_{j=1}^{n} w_j R_j\Big) = \mathrm{Cov}(R_i, R_P) = \sigma_{iP},$$

and finally:

$$\frac{\partial \sigma^2(R_P)}{\partial w_i} = 2\sigma_{iP}.$$

Thus, the marginal contribution of asset $i$ to the portfolio risk is twice its covariance with the portfolio.

Suppose now that the portfolio weights are all equal to $1/n$. Then, its variance is given by:

$$\sigma^2(R_P) = \sum_{i=1}^{n}\left(\frac{1}{n}\right)^2\sigma_i^2 + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left(\frac{1}{n}\right)^2\sigma_{ij}. \tag{3.7}$$

It can be written as follows:

$$\sigma^2(R_P) = \frac{1}{n}\left[\sum_{i=1}^{n}\frac{1}{n}\,\sigma_i^2\right] + \frac{n-1}{n}\left[\frac{2}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\sigma_{ij}\right]. \tag{3.8}$$

The first term converges to 0 when $n$ goes to infinity (the variances $\sigma_i^2$ are assumed to be uniformly bounded from above). Therefore, the individual contribution of each asset to the risk of the whole portfolio is negligible for large portfolios. If returns are independent, then the portfolio risk converges to 0, and the diversification effect is to eliminate the risk.

The second expression involves $n(n-1)/2$ covariances. This term does not converge to 0 if the covariances $\sigma_{ij}$ are assumed to take values in a given interval $[c_{\min}, c_{\max}]$ with $c_{\min} > 0$. It converges to the asymptotic mean of covariances.
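The limit behaviour of (3.8) is easy to see in the special case where all assets share the same variance and the same pairwise covariance. The sketch below makes that simplifying assumption (it is not required by the text) and uses hypothetical numerical values.

```python
def equal_weight_variance(n, var, cov):
    """Variance of the equally weighted portfolio of n assets that all have
    variance `var` and pairwise covariance `cov`, following relation (3.8)."""
    own_risk = var / n                 # (1/n) * average variance
    cross_risk = (n - 1) / n * cov     # ((n-1)/n) * average covariance
    return own_risk + cross_risk

def brute_force_variance(n, var, cov):
    """Same quantity computed directly from the double sum in (3.6)."""
    w = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += w * w * (var if i == j else cov)
    return total

# Hypothetical values: 20% volatility and a correlation of 0.25.
VAR, COV = 0.04, 0.01
for n in (1, 2, 10, 100, 1000):
    print(n, round(equal_weight_variance(n, VAR, COV), 6))
```

As $n$ grows, the printed variance decreases towards the common covariance: only the own-variance term is diversified away.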





3.1.2 Optimal weights 
3.1.2.1 Case 1: no riskless asset 


Following the Markowitz approach (see also [387]), we have to determine the set of portfolios which minimize the variance for given expected returns $E[R_P]$. This leads to the following quadratic optimization problem:

$$\begin{aligned} &\min_{w}\; w'Vw, \\ &\text{with } w'\overline{R} = E[R_P], \\ &\phantom{\text{with }} w'e = 1. \end{aligned} \tag{3.9}$$














The first constraint corresponds to the fixed expectation level. The second 
constraint is simply that w is a vector of weights. However, shortselling is 
allowed and no other specific constraints are introduced. 

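To solve Problem (3.9), consider the following Lagrangian functional:

$$L(w, \lambda, \delta) = w'Vw + \lambda\left(E[R_P] - w'\overline{R}\right) + \delta\left(1 - w'e\right), \tag{3.10}$$

where $\lambda$ and $\delta$ are the usual Lagrangian multipliers, which are constant parameters. Then, Problem (3.9) is equivalent to:

$$\min_{\{w,\lambda,\delta\}} L(w, \lambda, \delta) = w'Vw + \lambda\left(E[R_P] - w'\overline{R}\right) + \delta\left(1 - w'e\right). \tag{3.11}$$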
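The first-order conditions are:

$$\frac{\partial L(w,\lambda,\delta)}{\partial w} = 2Vw - \lambda\overline{R} - \delta e = 0, \tag{3.12}$$

$$\frac{\partial L(w,\lambda,\delta)}{\partial \lambda} = E[R_P] - w'\overline{R} = 0, \tag{3.13}$$

$$\frac{\partial L(w,\lambda,\delta)}{\partial \delta} = 1 - w'e = 0. \tag{3.14}$$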

Moreover, by assumption, the variance-covariance matrix V is invertible. 
Thus, the previous first-order conditions are necessary and sufficient to deter- 
mine the unique solution of this linear system. 


Define the following four real numbers $A$, $B$, $C$, and $D$ by:

$$A = e'V^{-1}\overline{R}, \quad B = \overline{R}'V^{-1}\overline{R}, \quad C = e'V^{-1}e, \quad\text{and}\quad D = BC - A^2.$$














Then, the optimal portfolio weights at the level $E[R_P]$ are given by:

$$w = \frac{1}{D}\left(BV^{-1}e - AV^{-1}\overline{R}\right) + E[R_P]\,\frac{1}{D}\left(CV^{-1}\overline{R} - AV^{-1}e\right). \tag{3.15}$$

Introduce $w_1$ and $w_2$:

$$w_1 = \frac{1}{D}\left(BV^{-1}e - AV^{-1}\overline{R}\right), \qquad w_2 = \frac{1}{D}\left(CV^{-1}\overline{R} - AV^{-1}e\right).$$

Neither $w_1$ nor $w_2$ depends on the given level $E[R_P]$. They are determined only by the financial market parameters: the vector of return expectations $\overline{R}$ and the variance-covariance matrix $V$. Using $w_1$ and $w_2$, we get:

$$w = w_1 + E[R_P]\,w_2. \tag{3.16}$$





Therefore we deduce: 


PROPOSITION 3.1 
For any given return expectation level E [Rp], the optimal portfolio exists and 
is unique. Moreover, it can be decomposed as a combination of two basic 
portfolios, since we have: 
































$$w = (1 - E[R_P])\,w_1 + E[R_P]\,(w_1 + w_2). \tag{3.17}$$








Any two distinct optimal portfolios generate the set of optimal portfolios. 
PROOF Examine the latter assertion. Consider two distinct optimal portfolios $g$ and $h$ with respective weights $w_g$ and $w_h$. Let $q$ be any optimal portfolio. We have to prove that portfolio $q$ is a combination of portfolios $g$ and $h$. More precisely, we search for a real number $a$ such that $w_q = a\,w_g + (1-a)\,w_h$.

- Since $E[R_g] \ne E[R_h]$, there exists a (unique) solution $a$ of the equation:

$$E[R_q] = a\,E[R_g] + (1-a)\,E[R_h].$$

- The portfolio $p$, with weights $\{a, (1-a)\}$ invested in $g$ and $h$, satisfies:

$$w_p = a\left(w_1 + w_2E[R_g]\right) + (1-a)\left(w_1 + w_2E[R_h]\right) = w_1 + w_2E[R_q].$$

Thus $w_p = w_q$. □
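The closed-form weights (3.15)-(3.16) and the two-fund property of Proposition 3.1 can be checked numerically. The sketch below is written in plain Python to stay dependency-free; the three-asset market (`R_BAR`, `V`), the target levels, and the small linear solver `solve` are hypothetical illustrations, not data from the text.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [list(row) + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical market with three risky assets.
R_BAR = [0.08, 0.12, 0.15]
V = [[0.040, 0.006, 0.004],
     [0.006, 0.090, 0.010],
     [0.004, 0.010, 0.160]]
E_VEC = [1.0, 1.0, 1.0]

V_INV_R = solve(V, R_BAR)   # V^{-1} R_bar
V_INV_E = solve(V, E_VEC)   # V^{-1} e
A = dot(E_VEC, V_INV_R)
B = dot(R_BAR, V_INV_R)
C = dot(E_VEC, V_INV_E)
D = B * C - A * A

W1 = [(B * ve - A * vr) / D for ve, vr in zip(V_INV_E, V_INV_R)]
W2 = [(C * vr - A * ve) / D for ve, vr in zip(V_INV_E, V_INV_R)]

def frontier_weights(target):
    """Minimum-variance weights at expected return `target`, relation (3.16)."""
    return [a + target * b for a, b in zip(W1, W2)]

def variance(w):
    return dot(w, [dot(row, w) for row in V])

# Two-fund separation: the portfolio at level 0.11 as a mix of levels 0.09 and 0.14.
W_G, W_H, W_Q = frontier_weights(0.09), frontier_weights(0.14), frontier_weights(0.11)
ALPHA = (0.11 - 0.14) / (0.09 - 0.14)   # solves 0.11 = a*0.09 + (1-a)*0.14
W_MIX = [ALPHA * g + (1 - ALPHA) * h for g, h in zip(W_G, W_H)]
print(W_Q, W_MIX)
```

The two printed vectors coincide up to rounding, which is the content of the last assertion of Proposition 3.1.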


REMARK 3.1 The portfolio $w_1$ is associated with the expectation level $E[R_P] = 0$. The portfolio $(w_1 + w_2)$ corresponds to the level 1. This relation proves the so-called "two mutual funds separation" property.


























REMARK 3.2 When the expectation level $E[R_P]$ is varying, the set of optimal portfolios is a half-line included in the set of all possible portfolios. Thus, whatever the number $n$ of securities, the optimal set is one-dimensional, whereas the set of all portfolios is $(n-1)$-dimensional (since $\sum_{i=1}^{n} w_i = 1$).











From relation (3.15), the minimal variance at the level $E(R_P)$ is computed. This yields the fundamental implicit relation between risk and expected returns:

$$\frac{\sigma^2(R_P)}{1/C} - \frac{\left(E(R_P) - A/C\right)^2}{D/C^2} = 1. \tag{3.18}$$
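The intermediate step between the first-order conditions and (3.18) can be sketched as follows. Multiplying (3.12) on the left by $w'$ and using the two constraints gives the variance in terms of the multipliers, whose values follow from substituting $w = \tfrac{1}{2}V^{-1}(\lambda\overline{R} + \delta e)$ into (3.13) and (3.14):

```latex
\begin{aligned}
2\,\sigma^2(R_P) &= 2\,w'Vw = \lambda\,E[R_P] + \delta,
\qquad
\frac{\lambda}{2} = \frac{C\,E[R_P] - A}{D},
\qquad
\frac{\delta}{2} = \frac{B - A\,E[R_P]}{D},\\[4pt]
\sigma^2(R_P) &= \frac{C\,E[R_P]^2 - 2A\,E[R_P] + B}{D}
= \frac{1}{C} + \frac{C}{D}\left(E[R_P] - \frac{A}{C}\right)^2 .
\end{aligned}
```

Multiplying the last expression by $C$ and rearranging gives the hyperbola equation.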


Geometrical interpretation.

This relation defines an arc of hyperbola in the plane with axes $(\sigma(R_P), E(R_P))$.

• Its vertex is the point with coordinates $(1/\sqrt{C}, A/C)$.

• Its asymptotes are defined by the equation:

$$E(R_P) = \frac{A}{C} \pm \sqrt{\frac{D}{C}}\;\sigma(R_P).$$
[Figure: mean return (vertical axis) against standard deviation (horizontal axis)]
FIGURE 3.2: Mean-variance portfolios
Therefore:

• There exists one and only one portfolio with the minimal standard deviation among all possible portfolios. This corresponds to the vertex of the hyperbola. It is usually called the minimum-variance portfolio (mvp). Its expectation is equal to $A/C$, and its standard deviation is equal to $1/\sqrt{C}$.

• The set of all possible portfolios is delimited by the arc of the hyperbola. Thus, this arc is called the portfolio frontier.

• Optimal portfolios with expected returns smaller than $A/C$ are dominated in the mean-variance sense by those which have expected returns higher than $A/C$: if $q$ is optimal and such that $E(R_q) < A/C$, then there exists an optimal portfolio $p$ with the same risk ($\sigma(R_p) = \sigma(R_q)$) and such that $E(R_p) > A/C$.

• Consequently, for the mean-variance analysis, the "rational" portfolios are those for which the expected return is higher than the expected return of the mvp. The set of such portfolios is called the efficient frontier.


REMARK 3.3 (Conditions for which there exists at least an efficient 
portfolio with all weights that are positive.) In that case, there is no short- 
selling. Such results are proposed in [66], [432], and [440]. 
3.1.2.2 Case 2: one riskless asset


The same kind of analysis can be used when there exists a riskless asset. Denote by $w$ the vector of weights of the $n$ risky assets, and by $R$ the vector of returns. The riskless asset has a return denoted by $R_f$. The percentage of wealth invested in this riskless asset is $w_0$. The budget constraint is:

$$w'e + w_0 = 1 \iff w_0 = 1 - w'e. \tag{3.19}$$





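Therefore, the new optimization program is:

$$\begin{aligned} &\min_{w}\; w'Vw, \\ &\text{with } w'\overline{R} + (1 - w'e)\,R_f = E[R_P]. \end{aligned} \tag{3.20}$$

The Lagrangian functional associated with Problem (3.20) is:

$$L(w, \lambda) = w'Vw + \lambda\left(E[R_P] - w'\overline{R} - (1 - w'e)\,R_f\right). \tag{3.21}$$

Thus, we have to solve:

$$\min_{\{w,\lambda\}} L(w, \lambda). \tag{3.22}$$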

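The first-order conditions, which are also necessary and sufficient, are given by:

$$\frac{\partial L(w,\lambda)}{\partial w} = 2Vw - \lambda\left(\overline{R} - eR_f\right) = 0, \tag{3.23}$$

$$\frac{\partial L(w,\lambda)}{\partial \lambda} = E[R_P] - w'\overline{R} - (1 - w'e)\,R_f = 0. \tag{3.24}$$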

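Then the optimal portfolio at the level $E[R_P]$ satisfies:

$$w = V^{-1}\left(\overline{R} - eR_f\right)\frac{E[R_P] - R_f}{\left(\overline{R} - eR_f\right)'V^{-1}\left(\overline{R} - eR_f\right)}. \tag{3.25}$$

Its variance is given by:

$$\sigma^2(R_P) = w'Vw = \frac{\left(E[R_P] - R_f\right)^2}{J}, \tag{3.26}$$

where $J = B - 2AR_f + CR_f^2$ is non-negative.
Therefore, its standard deviation is defined as a function of the expected return level $E[R_P]$:

$$\sigma(R_P) = \begin{cases} \dfrac{E[R_P] - R_f}{\sqrt{J}} & \text{if } E[R_P] \ge R_f, \\[8pt] -\dfrac{E[R_P] - R_f}{\sqrt{J}} & \text{if } E[R_P] < R_f. \end{cases} \tag{3.27}$$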














PROPOSITION 3.2
When a riskless asset is available, the optimal frontier is the union of two half-lines starting from the new mvp $(0, R_f)$, and with slopes $\sqrt{J}$ and $-\sqrt{J}$.
REMARK 3.4 The two-fund separation property shows that any efficient portfolio is a combination of the riskless asset and the tangent portfolio, which is the only efficient portfolio with nil weight on the riskless asset. The efficient frontier with the riskless asset can be decomposed into two parts:


• First, the segment which lies between the riskless portfolio and the tangential point $t$. This is the set of portfolios for which the weight $w_0$ invested in the riskless asset is positive. Investors who choose such portfolios are the more risk-averse ones: they prefer a small risk to a high expected return.


• Second, the half-line with origin $t$ is the set of all efficient portfolios with a short position on the riskless asset. Investors who choose such portfolios are less risk-averse than those of the previous case: they search for higher expected returns despite higher risk.


Several cases can occur: 


Case 1) $R_f < A/C$: the portfolio with the smallest risk (the mvp) has an expected return higher than the riskless one. This means that the financial market is favorable enough to invest in the mvp (nevertheless, its risk is not equal to 0).


PROPOSITION 3.3
In that case, the two efficient frontiers with and without the riskless asset have one intersection point $t$ corresponding to the portfolio weight $w_t$.

They are tangential at this point, defined by:

$$w_t = \frac{V^{-1}\left(\overline{R} - eR_f\right)}{A - CR_f}. \tag{3.28}$$

The expectation and variance of this portfolio are given by:

$$E[R_t] = w_t'\overline{R} = \frac{B - AR_f}{A - CR_f}, \tag{3.29}$$

$$\sigma^2[R_t] = w_t'Vw_t = \frac{J}{\left(A - CR_f\right)^2}.$$
This property is illustrated by the following figure.

FIGURE 3.3: Efficient frontiers ($R_f < A/C$)
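The tangent portfolio of Proposition 3.3 and the linear frontier (3.26) can be verified on a small example. This is a sketch under stated assumptions: the three-asset market, the riskless rate `R_F = 0.03`, and the helper `solve` are hypothetical and chosen so that $R_f < A/C$.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [list(row) + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical market: three risky assets and one riskless asset.
R_BAR = [0.08, 0.12, 0.15]
V = [[0.040, 0.006, 0.004],
     [0.006, 0.090, 0.010],
     [0.004, 0.010, 0.160]]
E_VEC = [1.0, 1.0, 1.0]
R_F = 0.03

A = dot(E_VEC, solve(V, R_BAR))
B = dot(R_BAR, solve(V, R_BAR))
C = dot(E_VEC, solve(V, E_VEC))
D = B * C - A * A

EXCESS = [r - R_F for r in R_BAR]
Z = solve(V, EXCESS)            # V^{-1} (R_bar - e R_f)
J = dot(EXCESS, Z)              # equals B - 2 A R_f + C R_f^2

def variance(w):
    return dot(w, [dot(row, w) for row in V])

W_T = [z / (A - C * R_F) for z in Z]   # tangent portfolio, relation (3.28)
MEAN_T = dot(W_T, R_BAR)
VAR_T = variance(W_T)

def weights_with_riskless(target):
    """Risky-asset weights (3.25); the remainder 1 - sum(w) is held in the riskless asset."""
    return [z * (target - R_F) / J for z in Z]

print(W_T, MEAN_T, VAR_T)
```

The tangent portfolio is fully invested in risky assets, and its mean and variance lie both on the hyperbola (3.18) and on the half-line (3.27), which is the tangency property.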


Case 2) $R_f > A/C$: the portfolio with the smallest risk (the mvp) has an expected return smaller than the riskless one. This case is illustrated by the following figure.

FIGURE 3.4: Efficient frontiers ($R_f > A/C$)
Case 3) $R_f = A/C$: Relation (3.27) gives:

$$E(R_P) = \frac{A}{C} \pm \sqrt{\frac{D}{C}}\;\sigma(R_P). \tag{3.30}$$

This is the equation of the two asymptotes of the efficient frontier without the riskless asset. Thus, the two efficient frontiers have no intersection point.

[Figure: the level $R_f = A/C$ is marked on the vertical axis and $1/\sqrt{C}$ on the horizontal axis]
FIGURE 3.5: Efficient frontiers ($R_f = A/C$)


Obviously, the interesting and usual case is the first case. Nevertheless, for 
a given period (bearish market), case (2) can be observed. 


3.1.3 Additional constraints 


Transaction costs and specific constraints, such as no shortselling (see Green [265], Dybvig [181], Alexander [17]), induce a set of constraints on the investor's portfolio. When these constraints are linear equalities, explicit solutions are still available. The optimization problem is solved by using Lagrangian methods, as shown in the previous section.

Otherwise, numerical methods must be used. Fortunately, most constraints lead to convex programming, such as the following problem:

$$\begin{aligned} &\min_{w}\; \Phi(w) \\ &\text{with } w \in H, \\ &\phantom{\text{with }} w \in K, \end{aligned} \tag{3.31}$$

where $H$ is a hyperplane, $K$ is a convex set, and $\Phi$ is a convex function.
Usually, $H$ is determined from a given matrix $A$ and vector $v$:
$H = \{w \text{ such that } Aw = v\}$,
and $K$ is defined from a set of inequalities on convex functions:
$K = \{w \text{ such that } \Psi(w) \le v\}$ where $\Psi$ is a convex function.


Convex optimization problems can be solved numerically¹, but the number of assets and the type of constraints may induce computational difficulties.


However, a special subclass of such constrained optimization problems is easier to handle, namely cone programming. Methods based on extensions of the interior point algorithm can be used, as for example in Nesterov and Nemirovski [399]. The functional $\Phi$ is linear or quadratic, and the feasible set determined by the constraints is the intersection of a hyperplane $H$ and a cone $C$.

For example, the standard mean-variance problem corresponds to $\Phi(w) = w'Vw$, with $H$ defined by the two linear constraints of Problem (3.9) and $C = \mathbb{R}^n$.

A no-shortselling condition is taken into account by setting $C = \mathbb{R}_+^n$. More general conditions, such as bounds on portfolio weights, can be introduced: $w_{\min} \le \Gamma w \le w_{\max}$ (component by component), where $w_{\min}$ and $w_{\max}$ are fixed vectors and $\Gamma$ is a given matrix. In that case, the corresponding convex set $K$ is the intersection of two cones. The choice of $w_{\min}$, $w_{\max}$, and $\Gamma$ may depend on specific constraints, such as:














• Particular allocations to bonds and stocks, limits on international diversification, etc., for institutional investors;


• Anticipated scenarios about industrial sector returns;


• In order to diversify, an upper bound of 2% can be imposed on each security; and,


¹ Special algorithms have been proposed for the mean-variance constrained optimization by Frank and Wolfe [240], and Perold [405]. See also Dantzig ([145], [146]) for linear programming, and Boyd and Vandenberghe [86]. Standard software packages offer such constrained optimization, but only for relatively simple constraints.
• To limit the amount of purchases when rebalancing the portfolio between two dates $t_1$ and $t_2$, the following constraint can be imposed: let $V_{t_1}$ be the wealth at time $t_1$. Then the portfolio weight $w$ at time $t_2$ satisfies:

$$V_{t_1} \times \sum_{i} \max\left(w_{i,t_2} - w_{i,t_1},\, 0\right) \le s,$$

where $s$ is a fixed amount.
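The purchase limit above is a convex constraint, since it is a sum of maxima of linear functions of the new weights. A minimal sketch of how it could be evaluated for a candidate rebalancing follows; the function names, the weights, and the wealth level are hypothetical.

```python
def purchase_amount(wealth, w_old, w_new):
    """Amount bought when moving from w_old to w_new: V * sum(max(w_new - w_old, 0))."""
    return wealth * sum(max(new - old, 0.0) for old, new in zip(w_old, w_new))

def satisfies_purchase_limit(wealth, w_old, w_new, s):
    """True when the rebalancing respects the fixed purchase budget s."""
    return purchase_amount(wealth, w_old, w_new) <= s

# Hypothetical rebalancing of a three-asset portfolio worth 1000.
W_OLD = [0.30, 0.30, 0.40]
W_NEW = [0.35, 0.20, 0.45]
print(purchase_amount(1000.0, W_OLD, W_NEW))
```

Only increases in weights count towards the limit; the sale of the second asset finances the purchases but does not enter the constraint.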


Example 3.3 
Consider a portfolio manager who must allocate the fund among five asset 
classes: 


1. Small caps; 2. Big caps; 3. Growth; 4. Value; and, 5. Others, 











with expectations $E$ and standard deviations $\sigma$ as follows (percentage %):

TABLE 3.1: Expectations, variances and covariances

$\overline{R}_3 = 20$    $\sigma_{34} = 2.05$    $\sigma_{35} = 2.75$
$\overline{R}_4 = 14$    $\sigma_{45} = 1.70$
$\overline{R}_5 = 13$


• Figure 3.6 corresponds to the case where only shortselling is forbidden.

• Assume now that the investor wants to invest at least 10% in small caps, at least 10% in big caps, and at most 80% in the group which contains the big caps and the value stocks. Then, the new constrained efficient frontier is dominated by the previous one (Figure 3.7).

• Finally, if the investor wants to invest at least 10% in small caps, at least 10% in the fifth class, at least 20% in each other class, and at most 80% in the group of big caps and value stocks, then a third constrained efficient frontier is obtained, which is dominated by all the previous ones (Figure 3.8).

FIGURE 3.6: Efficient frontier with no shortselling (plot title: Tracking Error Efficient Frontier)
FIGURE 3.7: Efficient frontier with additional group constraints
FIGURE 3.8: Efficient frontier with maximum number of constraints

3.1.4 Estimation problems


Mean-variance analysis heavily relies on “good” prediction of the mean and 
variance-covariance matrix of the securities. Indeed, mean-variance optimal 
portfolios are significantly sensitive to parameter values, as shown, for in- 
stance, in Best and Grauer [65]. These values can be estimated from financial 
data. However: 


• Using past data to predict future returns means that we assume stability of parameters through time. Generally, the variance-covariance matrix is more stable than the mean return.


• Parameter estimation may be cumbersome when, for example, the portfolio is composed of 150 to 250 securities.


• Numerical problems are also posed by the computation of the inverse variance-covariance matrix, as illustrated in particular by Stevens [480].


The most efficient method in terms of saving calculation time is to simplify 
the matrix structure. For this reason, specific hypotheses can be introduced: 


• The assumption that factor models such as Sharpe's market model are valid to predict asset returns from financial indices or macroeconomic factors.


• The assumption that some correlation coefficients are identical, for example for stocks belonging to the same industrial group.


In this framework, some of the following methods can be used: 
1) Single-index and multi-index models (see Chapter 5). 


For example, assume that asset returns satisfy the Sharpe market model: for all $i \in \{1, \ldots, n\}$, for any $t \in [0, T]$,

$$R_{i,t} = \alpha_{i,t} + \beta_{i,t}R_{M,t} + \varepsilon_{i,t}, \tag{3.32}$$

where $R_{M,t}$ denotes the return on the market index and $\varepsilon_{i,t}$ denotes the specific risk of asset $i$, uncorrelated with the market return. The parameters $\alpha_{i,t}$ and $\beta_{i,t}$ are determined from the linear regression of the asset returns on the market returns for the same period. In particular:

$$\beta_{i,t} = \mathrm{Cov}(R_{i,t}, R_{M,t})/\mathrm{Var}(R_{M,t}). \tag{3.33}$$

Then the total risk of the asset $i$ has two components: a term for systematic risk ("the market risk") and a term for non-systematic risk ("the diversifiable risk"):

$$\mathrm{Var}(R_{i,t}) = \beta_{i,t}^2\,\sigma_{M,t}^2 + \sigma_{\varepsilon_i}^2, \quad\text{with } \mathrm{Var}(R_{M,t}) = \sigma_{M,t}^2 \text{ and } \sigma_{\varepsilon_i}^2 = \mathrm{Var}(\varepsilon_{i,t}).$$
Therefore, for a single-index model, it is sufficient to calculate the covariances of each asset with the index: for $n$ assets, the number of covariances to estimate is $n$ instead of $n(n-1)/2$. Then, the matrix inversion is greatly simplified (see Bartlett [45]).
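The simplification can be sketched as follows. The code assumes, in addition to (3.32), that the specific risks of different assets are uncorrelated with each other (the usual diagonal form of the single-index model), so that $\sigma_{ij} = \beta_i\beta_j\sigma_M^2$ for $i \ne j$. The function names and sample data are hypothetical.

```python
def beta_from_samples(asset, market):
    """Regression slope Cov(R_i, R_M) / Var(R_M), as in relation (3.33)."""
    n = len(market)
    mean_a, mean_m = sum(asset) / n, sum(market) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset, market)) / n
    var = sum((m - mean_m) ** 2 for m in market) / n
    return cov / var

def single_index_covariance(betas, resid_vars, market_var):
    """Covariance matrix implied by the single-index model:
    beta_i * beta_j * market_var, plus the specific variance on the diagonal."""
    n = len(betas)
    return [[betas[i] * betas[j] * market_var + (resid_vars[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

# Hypothetical sample: an asset that moves exactly 1.5 times the market.
MARKET = [0.01, -0.02, 0.03, 0.00, 0.02]
ASSET = [0.002 + 1.5 * m for m in MARKET]
print(beta_from_samples(ASSET, MARKET))

COV = single_index_covariance([0.8, 1.2], [0.01, 0.02], 0.04)
print(COV)
```

Only the betas, the specific variances, and the market variance are needed to rebuild the whole matrix, which is what reduces the estimation burden.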


2) The method of Elton, Gruber, and Padberg ([198], [199]) (see also Chapter 9 of [200] for more details).


The method uses an optimal ranking of the assets, with the help of the simplified correlation structure. A threshold $C^*$ is determined. Then assets whose ratio lies below this threshold are rejected.


- The ratios $(E[R_i] - R_f)/\beta_i$ are ranked from the highest to the lowest. This induces the ranking of assets: $i_1 < \ldots < i_n$. The higher the value of the ratio, the more desirable it is to include the asset in the portfolio. Therefore, if an asset is rejected, then all following assets are also rejected.














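- The cut-off ratio $C^*$ is determined through an iterative procedure, as follows. Consider the successive ratios associated with portfolios containing the first $l$ assets. These ratios are given by:

$$C_l = \frac{\sigma_M^2 \displaystyle\sum_{j=1}^{l} \frac{\left(E[R_{i_j}] - R_f\right)\beta_{i_j}}{\sigma_{\varepsilon_{i_j}}^2}}{1 + \sigma_M^2 \displaystyle\sum_{j=1}^{l} \frac{\beta_{i_j}^2}{\sigma_{\varepsilon_{i_j}}^2}}. \tag{3.34}$$

Note that the ratio $C_l$ is also equal to:

$$C_l = \frac{\beta_{lP}\left(E[R_P] - R_f\right)}{\beta_l}, \tag{3.35}$$

where $\beta_{lP}$ is the expected change in the rate of return on asset $l$ associated with a 1% change in the return on the optimal portfolio.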


- Consider the unique $l^*$ for which all assets $i_j$ with $j \le l^*$ have ratios $(E[R_{i_j}] - R_f)/\beta_{i_j}$ higher than $C_{l^*}$, and for which all assets with $j > l^*$ have ratios smaller than $C_{l^*}$. The cut-off is then $C^* = C_{l^*}$. By (3.35), an asset $i_j$ is therefore included in the portfolio if its mean excess return, $E[R_{i_j}] - R_f$, is higher than the excess return expected from its relation to the optimal portfolio containing the first $j$ assets.
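The ranking and cut-off steps can be sketched as below. This is a minimal illustration assuming positive betas and no short sales; the function `egp_selection` and the six-asset data set (in percent) are hypothetical, and the cut-off formula used is the standard single-index form of (3.34).

```python
def egp_selection(mu, beta, resid_var, r_f, market_var):
    """Rank assets by (mu - r_f) / beta and return (selected indices, cut-off C*)."""
    order = sorted(range(len(mu)), key=lambda i: (mu[i] - r_f) / beta[i], reverse=True)
    num = den = 0.0
    selected, cutoff = [], 0.0
    for i in order:
        num += (mu[i] - r_f) * beta[i] / resid_var[i]
        den += beta[i] ** 2 / resid_var[i]
        c_l = market_var * num / (1.0 + market_var * den)   # relation (3.34)
        if (mu[i] - r_f) / beta[i] > c_l:
            selected.append(i)
            cutoff = c_l
        else:
            break   # every asset ranked after a rejected one is rejected as well
    return selected, cutoff

# Hypothetical data (returns and variances in percent).
MU = [15.0, 17.0, 12.0, 17.0, 11.0, 11.0]
BETA = [1.0, 1.5, 1.0, 2.0, 1.0, 1.5]
RESID_VAR = [50.0, 40.0, 20.0, 10.0, 40.0, 30.0]
R_F, MARKET_VAR = 5.0, 10.0

SELECTED, C_STAR = egp_selection(MU, BETA, RESID_VAR, R_F, MARKET_VAR)
print(SELECTED, round(C_STAR, 4))
```

With these inputs the first five assets are retained and the sixth, whose ratio is 4, falls below the cut-off.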




























3) Average correlation models. 


Averaging the data in the historical correlation matrix has been examined in [197] to forecast future values. This method assumes that the past correlation matrix provides information about what the average correlation will be in the future but contains no information about individual correlations. Therefore, the main assumption is that usual groups of industrial stocks have common correlations. Such problems can also be examined for international diversification, as in Longin and Solnik ([363], [364]).
4) Bayesian approach.


The Bayesian estimation process is based on the one hand on prior knowledge of market parameters ("the investor's experience"), and on the other hand on information from market observations. Then, a posterior return probability distribution can be determined. Such an approach has been examined by Stein [478], Frost and Savarino [244] and others. Black and Litterman [74] also use Bayes' rule to reduce the sensitivity of the optimal allocation to parameter choices. Meucci [390] provides details about the Bayesian procedure.


REMARK 3.5 The empirical observations lead to the following conclusions (see, e.g., Elton et al. ([198], [199]), Chan et al. [113], Zeng and Zhang [511]):


• The estimation of the variance-covariance matrix is easier than the mean return estimation. Estimation from individual return data does not provide efficient forecasts. Multi-index models and average correlation models give more accurate values by reducing the impact of fluctuations (see, for example, Eun and Resnick [214]). However, significant estimation errors may remain, as mentioned by Jobson and Korkie ([305], [306]), and Chopra and Ziemba [122].


• Taking account of additional constraints, in particular the no-shortselling condition, partly reduces estimation errors, as shown by Best and Grauer ([66], [67]) and by Frost and Savarino [245]. Moreover, since they better reflect the actual strategies of fund managers, these additional constraints often improve portfolio performance.
3.2 Alternative criteria


The Markowitz approach is not directly based on expected utility maxi- 
mization or risk measure minimization. However, under specific assumptions, 
the optimal solutions of the two previous problems are mean-variance efficient 
portfolios. 


3.2.1 Expected utility maximization 
3.2.1.1 Expected utility and mean-variance analysis 


The mean-variance analysis implicitly assumes that the investor's utility, defined on the portfolio return $R_P$, is a function $V(R_P)$ which depends only on the mean and the variance:

$$V(R_P) = f\left(E(R_P), \sigma^2(R_P)\right), \tag{3.36}$$

and is increasing with respect to the mean ($\partial f/\partial E > 0$), and decreasing with respect to the variance ($\partial f/\partial \sigma^2 < 0$). This kind of utility function is defined as a mean-variance utility function.

Consider an investor with an initial amount $V_0$ and a time horizon $T$ who uses a "buy and hold" strategy (static portfolio optimization). In that case, the expected utility maximization is equivalent to the search for a vector of weights, $w = (w_1, \ldots, w_n)$, invested in $n$ securities, which is the solution of:

$$\max_{w}\; E_{\mathbb{P}}\left[U(V_T)\right]. \tag{3.37}$$





Epstein [210] proves that mean-variance utility functions are implied by a 
set of decreasing absolute risk aversion postulates. 
For two main cases, the optimal solution is indeed mean-variance efficient: 


• The utility is quadratic: $U(x) = x - \frac{k}{2}x^2$, with $k > 0$, defined on the set $]-\infty, 1/k]$ on which it is increasing. Then:

$$E_{\mathbb{P}}\left[U(V_T)\right] = E(V_0 R_P) - \frac{k}{2}V_0^2\,E\left(R_P^2\right) = f\left(E(R_P), \sigma^2(R_P)\right), \tag{3.38}$$

with

$$f\left(E(R_P), \sigma^2(R_P)\right) = U\left(V_0\,E(R_P)\right) - \frac{k}{2}V_0^2\,\sigma^2(R_P). \tag{3.39}$$

Since the portfolio value is assumed to be in the set $]-\infty, 1/k]$, the quantity $U(V_0\,E(R_P))$ is indeed an increasing function of the mean $E(R_P)$. Therefore, any optimal solution is mean-variance efficient.
• The utility is exponential: $U(x) = -\frac{1}{A}\exp[-Ax]$ with $A > 0$, defined on $\mathbb{R}$. Moreover, the vector of returns $R$ is supposed to be Gaussian. Thus, the portfolio return $R_P$ also has a Gaussian law, since $R_P = w'R$ and the Gaussian family is stable under linear combinations. Therefore, up to its sign, the utility $U(V_T)$ has a lognormal distribution. Recall the following property: if the random variable $X$ has a Gaussian law $\mathcal{N}(m, \sigma^2)$, then

$$E\left[e^{X}\right] = \exp\left[m + \frac{\sigma^2}{2}\right].$$

Using this expression, we deduce:

$$E_{\mathbb{P}}\left[U(V_T)\right] = -(1/A)\exp\left[-A\left(E(R_P) - A/2\;\sigma^2(R_P)\right)\right]. \tag{3.40}$$

Therefore, the maximization of this expected utility is also equivalent to the maximization of a mean-variance utility with:

$$f\left(E(R_P), \sigma^2(R_P)\right) = E(R_P) - A/2\;\sigma^2(R_P). \tag{3.41}$$
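The step from the Gaussian property to (3.40) can be spelled out as follows, under the assumption (implicit in (3.40)) that the initial wealth is normalized to $V_0 = 1$, so that $V_T = R_P$:

```latex
\begin{aligned}
-A\,R_P &\sim \mathcal{N}\!\left(-A\,E(R_P),\; A^2\sigma^2(R_P)\right),\\[2pt]
E\!\left[e^{-A R_P}\right] &= \exp\!\left[-A\,E(R_P) + \tfrac{A^2}{2}\,\sigma^2(R_P)\right],\\[2pt]
E_{\mathbb{P}}\!\left[U(V_T)\right] &= -\tfrac{1}{A}\,E\!\left[e^{-A R_P}\right]
= -\tfrac{1}{A}\exp\!\left[-A\left(E(R_P) - \tfrac{A}{2}\,\sigma^2(R_P)\right)\right].
\end{aligned}
```

Since $x \mapsto -\tfrac{1}{A}e^{-Ax}$ is increasing, maximizing the expected utility amounts to maximizing the term inside the inner parentheses.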


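REMARK 3.6 For the previous cases, the functional to maximize has the following form:

$$V(R_P) = E(R_P) - \frac{\phi}{2}\,\sigma^2(R_P) \quad\text{where } \phi > 0. \tag{3.42}$$

The parameter $\phi$ is the marginal substitution rate between the mean and the variance. It can be viewed as an aversion to the variance.


Let us examine this kind of problem. For a given aversion to variance $\phi$, we have to solve:

$$\begin{aligned} &\max_{w}\; w'\overline{R} - \frac{\phi}{2}\,w'Vw, \\ &\text{with } w'e = 1. \end{aligned} \tag{3.43}$$

The Lagrangian associated with Problem (3.43) is defined by:

$$L(w, \lambda) = w'\overline{R} - \frac{\phi}{2}\,w'Vw + \lambda\left(1 - w'e\right). \tag{3.44}$$

The first-order conditions are given by:

$$\frac{\partial L(w,\lambda)}{\partial w} = \overline{R} - \phi Vw - \lambda e = 0, \tag{3.45}$$

$$\frac{\partial L(w,\lambda)}{\partial \lambda} = 1 - w'e = 0. \tag{3.46}$$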

Solving this linear system, we deduce: 
PROPOSITION 3.4
The optimal solution is equal to:

$$w = \frac{1}{\phi}\left(V^{-1}\overline{R} - \frac{A - \phi}{C}\,V^{-1}e\right). \tag{3.47}$$

Its mean and variance are given by:

$$E(R_P) = w'\overline{R} = \frac{D}{\phi\,C} + \frac{A}{C}, \tag{3.48}$$

$$\sigma^2(R_P) = w'Vw = \frac{D}{\phi^2 C} + \frac{1}{C}. \tag{3.49}$$


When the parameter $\phi$ is varying in $\mathbb{R}_+^*$, the set of optimal solutions is exactly equal to the set of efficient portfolios.
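Proposition 3.4 can be checked numerically. The sketch below assumes the criterion $w'\overline{R} - \frac{\phi}{2}w'Vw$ of Problem (3.43); the three-asset market and the helper `solve` are hypothetical illustrations.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [list(row) + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical market with three risky assets.
R_BAR = [0.08, 0.12, 0.15]
V = [[0.040, 0.006, 0.004],
     [0.006, 0.090, 0.010],
     [0.004, 0.010, 0.160]]
E_VEC = [1.0, 1.0, 1.0]

V_INV_R, V_INV_E = solve(V, R_BAR), solve(V, E_VEC)
A, B, C = dot(E_VEC, V_INV_R), dot(R_BAR, V_INV_R), dot(E_VEC, V_INV_E)
D = B * C - A * A

def variance(w):
    return dot(w, [dot(row, w) for row in V])

def utility(w, phi):
    """Mean-variance criterion of Problem (3.43)."""
    return dot(w, R_BAR) - 0.5 * phi * variance(w)

def optimal_weights(phi):
    """Maximizer of the criterion subject to w'e = 1, relation (3.47)."""
    lam = (A - phi) / C   # multiplier of the budget constraint
    return [(vr - lam * ve) / phi for vr, ve in zip(V_INV_R, V_INV_E)]

for phi in (2.0, 10.0, 1e9):
    w = optimal_weights(phi)
    print(phi, dot(w, R_BAR), variance(w))
```

For a very large aversion $\phi$ the printed mean and variance approach $A/C$ and $1/C$, the characteristics of the mvp.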


REMARK 3.7 When the aversion to variance $\phi$ goes to infinity, we have:

$$E(R_P) \to \frac{A}{C} \quad\text{and}\quad \sigma^2(R_P) \to \frac{1}{C}. \tag{3.50}$$

In that case, the investor chooses the mvp portfolio.
When the aversion $\phi$ goes to 0, we have:

$$E(R_P) \to +\infty \quad\text{and}\quad \sigma^2(R_P) \to +\infty. \tag{3.51}$$

The smaller $\phi$, the higher the mean, but also the higher the risk.


REMARK 3.8 As can be seen, the main advantage of mean-variance utility functions is the simplicity of the determination of optimal solutions. Moreover, when convex constraints are added, such problems as

$$\begin{aligned} &\max_{w}\; w'\overline{R} - \frac{\phi}{2}\,w'Vw, \\ &\text{with } w'e = 1 \text{ and } w \in K, \end{aligned} \tag{3.52}$$

have numerical solutions computed with efficient algorithms (see Section 3.1.3 and [405]).


3.2.1.2 Optimal weights for expected utility maximization 


The first-order condition of problem (3.37) is given by: for all $i \in \{1, \ldots, n\}$,

$$E\left[U'(V_0 R_P)\,(R_i - R_f)\right] = 0, \tag{3.53}$$

where the portfolio return $R_P$ is equal to $R_f + \sum_{i=1}^{n} w_i (R_i - R_f)$.

The solutions of the previous system of equations are generally implicit, and 
numerical algorithms are needed to analyze them. However, for an important 
class of utility functions, the HARA family, qualitative results can be derived, 
in particular the two-fund separation. 




3.2.1.3 Two-fund separation 


The mean-variance efficient portfolios have the two-fund separation proper- 
ty. This is not the only case for which this property is satisfied: for example, if 
there exists a riskless asset, other utility functions imply two-fund separation. 
In addition, particular probability distributions can also induce separation 
property. 


3.2.1.3.1 Two-fund separation and utility function Indeed, Cass 
and Stiglitz [108] provide necessary and sufficient conditions on utility func- 
tions to get the two-fund separation. 


PROPOSITION 3.5 
For the static portfolio optimization problem: 

1) The two-fund separation property is satisfied for any probability distri- 
bution of returns if and only if the investor has a quadratic utility function. 


2) If there exists a riskless asset, then a necessary and sufficient condition for investors to have the same percentages of each risky asset is that they have an absolute risk aversion (ARA), such that for any wealth level $V$,

$$ARA(V) = -\frac{U''(V)}{U'(V)} = \frac{1}{A + BV}, \tag{3.54}$$

where the parameter $B$ is the same for all investors and is assumed to be positive. Thus, they have the same HARA utility function (up to a linear transformation). They differ only by their wealth $V_0$.


PROOF We examine only the sufficient condition (see [108] for the com- 
plete proof): 

Note that Condition (3.54) implies that the marginal utility U’ has the 
form: 


1) If 8 # 0, denote a = A/(GB). Then: 


$$U'(V) = (A + BV)^{-1/B}.$$


2) If $B = 0$:


$$U'(V) = \exp\left(-\frac{V}{A}\right).$$


For the HARA utility functions, note that the first-order condition (3.53) is equivalent to: if $B \neq 0$, for all $i \in \{1,\dots,n\}$,

$$E\left[\left(A + B V_0 \left(R_f + \sum_{j=1}^{n} w_j (R_j - R_f)\right)\right)^{-1/B} (R_i - R_f)\right] = 0. \tag{3.55}$$

If $A = 0$ (CRRA case), then the optimal weights $w_f$ and $w = (w_1,\dots,w_n)$ invested on the $n$ risky assets, which are the solutions of the system of equations (3.55), do not depend on $BV_0$. They are the same for all investors having CRRA utility functions with the same parameter $B$. Denote them by:

$$w_f^{CRRA} \quad \text{and} \quad w^{CRRA} = \left(w_1^{CRRA},\dots,w_n^{CRRA}\right).$$

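If $A \neq 0$, define $C$ by:

$$C = \frac{B V_0}{A + B V_0 R_f}.$$

Then, Equation (3.55) is equivalent to: for all $i \in \{1,\dots,n\}$,

$$E\left[\left(1 + \sum_{j=1}^{n} C w_j (R_j - R_f)\right)^{-1/B} (R_i - R_f)\right] = 0.$$

Therefore, for all $i \in \{1,\dots,n\}$, the value of $C w_i$ does not depend on the parameters of the HARA utility function.


Thus, the ratios of optimal weights $w_i^{HARA}$ invested on the risky assets do not depend on $BV_0$ either. For parameter $B$, they satisfy: for all $i, j \in \{1,\dots,n\}$,

$$\frac{w_i^{HARA}}{w_j^{HARA}} = \frac{w_i^{CRRA}}{w_j^{CRRA}}.$$

The same result is deduced when $B = 0$.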


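REMARK 3.9 (Determination of the utility function)
1) If $B \neq 0$, the utility function is given by:

$$U(V) = \begin{cases} \dfrac{1}{B-1}\,(A + BV)^{1 - 1/B} & \text{if } B \neq 1, \\[2mm] \ln(A + BV) & \text{if } B = 1. \end{cases}$$

2) If $B = 0$:

$$U(V) = -A \exp\left(-\frac{V}{A}\right).$$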


Examine the properties of optimal weights for the HARA case. Denote by $w^* = (w_1^*,\dots,w_n^*)$ the percentages of risky assets in the mutual fund, which depend only on the parameter $B$. Note that $\sum_i w_i^* = 1$. This portfolio is the optimal portfolio of an investor with a CRRA utility function ($A = 0$).


PROPOSITION 3.6
For HARA functions, the optimal weights are functions of the CRRA optimal weights.


PROPOSITION 3.7
(Determination of the optimal weights)
1) If $B \neq 0$, the optimal weights are given by:

$$w_i = v\, w_i^* \quad \text{and} \quad w_f = 1 - v, \quad \text{with} \quad v = (1 - \lambda)\left(1 + \frac{A}{B V_0 R_f}\right),$$

where the parameter $\lambda$ is equal to $w_f$ for the CRRA utility function.
Thus, $v$ is the percentage of risky investment, and $(1 - v)$ the percentage of riskless one.
2) If $B = 0$, the optimal weights are given by:

$$w_i = \left(\frac{(1-\lambda)\,a}{V_0}\right) w_i^*, \qquad w_f = \frac{\lambda a + V_0 - a}{V_0}.$$





REMARK 3.10 For the first case, note that all investors with CRRA utility functions have the same optimal portfolio ($w_f$ and $w$) whatever their initial wealth. For $A \neq 0$, the percentage $v$ of risky investment is decreasing w.r.t. the initial wealth. More precisely, the demand upon each risky asset $i$ (i.e., the amount invested on $i$) is linear in wealth. For the second case, the demand upon each risky asset $i$ is constant.
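The wealth dependence described in Remark 3.10 can be checked numerically. The sketch below assumes that, for $B \neq 0$, the risky share takes the form $v = (1-\lambda)\left(1 + A/(B V_0 R_f)\right)$, which follows from the scaling argument of the proof (the CRRA weights are rescaled by $(A + B V_0 R_f)/(B V_0 R_f)$). The function name and all parameter values are hypothetical.

```python
def hara_risky_share(lam, A, B, V0, Rf):
    """Share v of wealth held in risky assets for a HARA investor with B != 0.

    lam is the riskless weight w_f of the CRRA investor (A = 0) with the same B.
    Assumed formula: v = (1 - lam) * (1 + A / (B * V0 * Rf)).
    """
    if B == 0:
        raise ValueError("this formula assumes B != 0")
    return (1.0 - lam) * (1.0 + A / (B * V0 * Rf))

# Hypothetical parameter values, chosen only for illustration.
lam, A, B, Rf = 0.4, 50.0, 0.5, 1.02
for V0 in (100.0, 200.0, 400.0):
    v = hara_risky_share(lam, A, B, V0, Rf)
    print(f"V0={V0:6.0f}  risky share={v:.4f}  risky amount={v * V0:.2f}")
```

With $A > 0$ the share falls as wealth rises, while the amount $v V_0$ grows linearly in $V_0$; with $A = 0$ the share is constant and equal to $1 - \lambda$.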


Another approach is based on the characterization of probability distribu- 
tions that imply two-fund separation. 


3.2.1.3.2 Two-fund separation and probability distribution Multi- 
variate Gaussian distributions are particular solutions of this problem. They 
are special cases of the spherical distributions, defined as follows: 


DEFINITION 3.1 A random vector $R$ has a spherical distribution if for every orthogonal map $L \in \mathbb{R}^{n \times n}$ (i.e., $L'L = LL' = I_n$, where $I_n$ is the $n$-dimensional identity matrix), the random variables $R$ and $LR$ have the same distribution.


This means that the distribution of a spherical random variable is invariant to rotation of the coordinates.

DEFINITION 3.2 A spherically distributed random vector $R$ is determined from its characteristic function $\varphi_R$, which must satisfy: for any vector $t \in \mathbb{R}^n$,

$$\varphi_R(t) = E\left[\exp(i\, t'R)\right] = \psi(t't),$$

where $\psi$ is some scalar function, usually called the characteristic generator of the spherical distribution (see Fang et al. [220] for properties of such a distribution).


Examples of spherical distributions are the Gaussian distributions, the Student-t distributions, the logistic distributions, etc.


Chamberlain [112] provides the complete family of distributions that are 
necessary and sufficient for the expected utility of final wealth to be a mean- 
variance utility function. 


PROPOSITION 3.8 


1) If there is a riskless asset, then the distribution of any portfolio is deter- 
mined by its mean and variance if and only if the random vector of returns R 
is a linear transformation of a spherically distributed random vector. 


2) If there is no riskless asset, then the spherically distributed random vector is replaced by a random vector in which the last $n - 1$ components are spherically distributed conditional on the first component, which has an arbitrary distribution. If the number of assets is infinite, then there must exist random variables $X$, $Y$, and $\varepsilon$, where the distribution of $\varepsilon$ conditional on $X$ and $Y$ is standard normal, such that every portfolio is distributed as some linear combination of $X$ and $Y\varepsilon$. (If there is a riskless asset, then $X$ has zero variance.)


The general result is provided by Ross [433]. 


PROPOSITION 3.9 


For a risk-averse investor, necessary and sufficient conditions to have the two-fund separation property are the following ones:


Case 1: There is no riskless asset. The condition is that there exist two random variables $X$ and $Y$, and two portfolios $w^{(1)}$ and $w^{(2)}$, such that the risky asset returns $R_i$ can be written as follows: for all $i \in \{1,\dots,n\}$,

$$R_i = X + a_i Y + \varepsilon_i, \tag{3.56}$$

with:

$$E[\varepsilon_i \mid X, Y] = 0, \qquad \sum_{i=1}^{n} w_i^{(1)} \varepsilon_i = \sum_{i=1}^{n} w_i^{(2)} \varepsilon_i = 0, \tag{3.57}$$

$$\sum_{i=1}^{n} w_i^{(1)} a_i \neq \sum_{i=1}^{n} w_i^{(2)} a_i. \tag{3.58}$$

This condition is equivalent to the existence of two portfolios $w^{(1)}$ and $w^{(2)}$, with returns $R^{(1)}$ and $R^{(2)}$, such that the risky asset returns $R_i$ can be written as follows: for all $i \in \{1,\dots,n\}$ and some real coefficients $b_i$,

$$R_i = R^{(1)} + b_i \left(R^{(2)} - R^{(1)}\right) + \varepsilon_i, \quad \text{with} \quad E\left[\varepsilon_i \mid R^{(1)}, R^{(2)}\right] = 0.$$

Case 2: There is a riskless asset with return $R_f$. The conditions are the same if we set $X = R_f$ and $w_f^{(1)} = 1$, $w^{(1)} = (0,\dots,0)$.

PROOF We examine only the sufficient condition (see [433] for the nec- 
essary condition). 

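Using the risk notion of Rothschild and Stiglitz ([435],[436]) (see Chapter 1), it is sufficient to prove that any portfolio $w^P$ is more risky than the combination $w^{(1,2)}$ of $w^{(1)}$ and $w^{(2)}$ which has the same mean return. From Condition (3.56), we have:

$$R_P = \sum_{i=1}^{n} w_i^P R_i = X + \sum_{i=1}^{n} w_i^P a_i\, Y + \sum_{i=1}^{n} w_i^P \varepsilon_i.$$

From Condition (3.58), there always exists a real number $\gamma^P$ such that:

$$\sum_{i=1}^{n} w_i^P a_i = \gamma^P \left(\sum_{i=1}^{n} w_i^{(1)} a_i\right) + (1 - \gamma^P) \left(\sum_{i=1}^{n} w_i^{(2)} a_i\right) = \gamma^P a^{(1)} + (1 - \gamma^P)\, a^{(2)}.$$

Consequently, the portfolio return $R_P$ can be written as:

$$R_P = \gamma^P \left(X + \sum_{j=1}^{n} w_j^{(1)} a_j\, Y\right) + (1 - \gamma^P) \left(X + \sum_{j=1}^{n} w_j^{(2)} a_j\, Y\right) + \sum_{j=1}^{n} w_j^P \varepsilon_j = \gamma^P R^{(1)} + (1 - \gamma^P) R^{(2)} + \sum_{i=1}^{n} w_i^P \varepsilon_i. \tag{3.59}$$

Consider now the portfolio $w^{(1,2)} = \gamma^P w^{(1)} + (1 - \gamma^P)\, w^{(2)}$. Then:

$$R_P = R^{(1,2)} + \varepsilon^P \quad \text{with} \quad \varepsilon^P = \sum_{i=1}^{n} w_i^P \varepsilon_i.$$

From Condition (3.57),

$$E\left[\sum_{i=1}^{n} w_i^P \varepsilon_i \,\Big|\, R^{(1,2)}\right] = E\left[\sum_{i=1}^{n} w_i^P \varepsilon_i \,\Big|\, X + \sum_{i=1}^{n} w_i^P a_i\, Y\right] = 0.$$

Therefore, since we have:

$$R_P = R^{(1,2)} + \varepsilon^P \quad \text{with} \quad E\left[\varepsilon^P \mid R^{(1,2)}\right] = 0,$$

the portfolio $R_P$ is more risky than $R^{(1,2)}$ (according to Rothschild and Stiglitz) and cannot be optimal. Case 2 is proved in the same manner.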


REMARK 3.11 When the investor is risk-averse and no riskless asset is available, the two-fund separation property is based on two conditions:

- First, the return of any asset $i$ is generated by two common factors.

- Second, there exist two portfolios, $w^{(1)}$ and $w^{(2)}$, with distinct mean returns and without residual risks.

- As proved in Chamberlain [112], if for any asset $i$, there exist $a_i \in \mathbb{R}$ and $b_i \in \mathbb{R}$ such that:

$$R_i = a_i X + b_i Y + \varepsilon_i \quad \text{with} \quad E[\varepsilon_i \mid X, Y] = 0, \tag{3.60}$$

then the property of two-fund separability is satisfied, even if the utility function is not concave.


3.2.2 Risk measure minimization 


If investors are wary of downside risks, they want to choose portfolios with small probabilities of loss. As seen in Chapter 2, they may seek to minimize the probability of having returns under a given level (this refers to VaR), or the expectation of the losses under this level (this refers to CVaR).

Attempts to solve this kind of problem are found in Roy [437], Telser [490], 
and Kataoka [324], when portfolio returns have Gaussian probability distri- 
butions. 


3.2.2.1 Safety first 


3.2.2.1.1 Roy’s criterion Let $R_{\min}$ be the minimum return fixed by the investor or by statutory conditions for particular funds. Roy’s criterion is the minimization of the probability of getting a portfolio return lower than $R_{\min}$. Then, the optimization problem is:

$$\min_{w} P[R_P < R_{\min}]. \tag{3.61}$$

Assume that the vector of returns has a multivariate Gaussian distribution and no riskless asset is available. Since any portfolio return also has a Gaussian law $\mathcal{N}(E[R_P], \sigma_P)$, then, by normalizing the return $R_P$, the Roy problem is equivalent to:

$$\min_{w} P\left[\frac{R_P - E[R_P]}{\sigma_P} < \frac{R_{\min} - E[R_P]}{\sigma_P}\right].$$

Now the probability distribution of the random variable $\frac{R_P - E[R_P]}{\sigma_P}$ is $\mathcal{N}(0,1)$. Thus, the problem is equivalent to the minimization of $\frac{R_{\min} - E[R_P]}{\sigma_P}$, and also to the maximization of the quantity $a = \frac{E[R_P] - R_{\min}}{\sigma_P}$. From a geometrical point of view, in the Markowitz plane $(\sigma_P, E[R_P])$, this latter ratio is the slope of a half-line starting from point $(0, R_{\min})$.

Its equation is:

$$E[R_P] = a\, \sigma_P + R_{\min}.$$


Therefore, under the Gaussian assumption, the Roy portfolio is on the half- 
line starting from point (0, Rmin), having the maximum slope and a non-empty 
intersection with the set of all possible portfolios. 
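Under the Gaussian assumption and without further constraints, the maximization of the ratio $a$ has a closed-form solution: the fully invested weights are proportional to $V^{-1}(\bar{R} - R_{\min} e)$, provided $R_{\min}$ is below the mean of the minimum-variance portfolio. The following minimal sketch illustrates this; the function names and the numerical inputs are hypothetical.

```python
import numpy as np
from statistics import NormalDist

# Hypothetical inputs (not from the text): expected returns and covariance matrix.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.000],
              [0.006, 0.090, 0.012],
              [0.000, 0.012, 0.160]])

def roy_portfolio(mu, V, r_min):
    """Fully invested weights maximizing (E[R_P] - r_min) / sigma_P."""
    e = np.ones(len(mu))
    z = np.linalg.solve(V, mu - r_min * e)
    total = e @ z
    if total <= 0:
        raise ValueError("r_min must lie below the mean of the minimum-variance portfolio")
    return z / total

def safety_ratio(w, mu, V, r_min):
    """Slope a of the half-line from (0, r_min) through the portfolio w."""
    return (w @ mu - r_min) / np.sqrt(w @ V @ w)

r_min = 0.02
w_roy = roy_portfolio(mu, V, r_min)
a = safety_ratio(w_roy, mu, V, r_min)
shortfall_prob = NormalDist().cdf(-a)   # P[R_P < r_min] for Gaussian returns
print("weights:", np.round(w_roy, 4))
print(f"slope a = {a:.4f}, shortfall probability = {shortfall_prob:.4f}")
```

The optimality follows from the Cauchy-Schwarz inequality applied to $w'(\bar{R} - R_{\min} e)$, so no numerical optimizer is needed in this unconstrained case.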


The solution is tangent to the efficient portfolios curve, as illustrated by the following graph:

[Figure: three half-lines starting from the point $(0, R_{\min})$, with equations $E[R_P] = R_{\min} + K_j\, \sigma(R_P)$, $j = 1, 2, 3$, drawn against the efficient frontier.]

FIGURE 3.9: Roy’s portfolio

The half-line with the lowest slope is mean-variance dominated by the tangent, which would be dominated by the third half-line, but this one has an empty intersection with the set of portfolios.


The determination of Roy’s portfolio is, for example, deduced by searching for the value of the maximum ratio $a = \frac{E[R_P] - R_{\min}}{\sigma_P}$:
1) Substituting $a\, \sigma_P + R_{\min}$ for $E[R_P]$ in the hyperbola’s equation induces a quadratic polynomial equation with unknown variable $\sigma_P$. The half-line is tangent to the efficient frontier if and only if its discriminant $\Delta$ is equal to 0.
2) $\Delta$ is itself a quadratic function of the ratio $a$. Thus, we have to solve this second equation. The optimal ratio $a$ is the highest solution.
Tangency conditions can also be used to determine Roy’s portfolio.


3.2.2.1.2 Telser criterion Telser’s portfolio is the portfolio which has the highest return expectation under the safety condition. We have to solve:

$$\max_{w} E[R_P], \quad \text{with} \quad P[R_P < R_{\min}] \le \varepsilon, \tag{3.62}$$

where both the minimum return $R_{\min}$ and the threshold $\varepsilon$ are fixed.
Under the Gaussian assumption, the safety condition is equivalent to:

$$\frac{R_{\min} - E[R_P]}{\sigma_P} \le x_\varepsilon,$$

where $x_\varepsilon$ is the quantile of the standard Gaussian distribution at the level $\varepsilon$.

In the Markowitz plane $(\sigma_P, E[R_P])$, the set of solutions is the area above the line defined by $E[R_P] = R_{\min} + (-x_\varepsilon)\, \sigma_P$.

For example, if $\varepsilon = 5\%$, we have $x_\varepsilon \approx -1.65$. Then the safety condition is approximately:

$$E[R_P] \ge R_{\min} + 1.65\, \sigma_P.$$

The Telser portfolio is the intersection point of the efficient frontier with the half-line having the equation:

$$E[R_P] = R_{\min} + (-x_\varepsilon)\, \sigma_P,$$

as illustrated by the following graph.

[Figure: efficient frontier and the half-line $E[R_P] = R_{\min} + (-x_\varepsilon)\, \sigma_P$; vertical axis $\mu(R_P)$.]

FIGURE 3.10: Telser’s portfolio


3.2.2.2 Kataoka criterion 


The idea of Kataoka is to search for the maximum value of the minimum return $R_{\min}$, guaranteed at a fixed probability threshold $\varepsilon$. We have to solve:

$$\max_{w} R_{\min}, \quad \text{with} \quad P[R_P < R_{\min}] \le \varepsilon. \tag{3.63}$$

Under Gaussian assumptions, this problem is equivalent to:

$$\max_{w} R_{\min}, \quad \text{with} \quad \frac{E[R_P] - R_{\min}}{\sigma_P} \ge -x_\varepsilon.$$

In the Markowitz plane, consider the half-lines with the same fixed slope $(-x_\varepsilon)$. Then, the Kataoka portfolio is the tangent point of the efficient frontier with the unique half-line having the given slope $-x_\varepsilon$, and the value of $R_{\min}$ is given by the intersection with the vertical axis (see $R_{\min 2}$ in Figure 3.11).

FIGURE 3.11: Kataoka’s portfolio


REMARK 3.12 The three previous criteria are based on a fixed level guaranteed at a given probability threshold. Under Gaussian assumptions, all optimal solutions are necessarily mean-variance efficient. As a consequence, a unique efficient portfolio can be determined by choosing the values of $R_{\min}$ and/or the threshold $\varepsilon$.


• Roy’s criterion is based on the search for a true guarantee, since it is for this criterion that the probability level $\varepsilon$ is the lowest. For example, when market volatility is high, an investor may want to reduce market risk.


• If market parameters are assumed to be well estimated, and the minimum level $R_{\min}$ and the threshold $\varepsilon$ can be easily determined, then “rationally,” the investor maximizes the return expectation. This is the purpose of Telser’s criterion.


• In the same framework, Kataoka’s portfolio has the most attractive guarantee for a given probability threshold.
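The Telser and Kataoka criteria can be compared on a toy Gaussian example by scanning the efficient frontier. The sketch below assumes the standard frontier equation $\sigma^2 = (C m^2 - 2 A m + B)/D$ with $A = e'V^{-1}\bar{R}$, $B = \bar{R}'V^{-1}\bar{R}$, $C = e'V^{-1}e$ and $D = BC - A^2$; the grid search, the variable names and the numerical inputs are hypothetical choices made for illustration.

```python
import numpy as np
from statistics import NormalDist

# Hypothetical inputs (not from the text): expected returns and covariance matrix.
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.000],
              [0.006, 0.090, 0.012],
              [0.000, 0.012, 0.160]])

e = np.ones(len(mu))
Vinv_e = np.linalg.solve(V, e)
Vinv_mu = np.linalg.solve(V, mu)
A, B, C = e @ Vinv_mu, mu @ Vinv_mu, e @ Vinv_e
D = B * C - A ** 2

def frontier_sigma(m):
    """Standard deviation of the minimum-variance portfolio with mean m."""
    return np.sqrt((C * m ** 2 - 2.0 * A * m + B) / D)

eps = 0.05
k = -NormalDist().inv_cdf(eps)            # k = -x_eps, about 1.645

# Efficient branch of the frontier: means at or above the mvp mean A/C.
means = np.linspace(A / C, 0.50, 20001)
sigmas = frontier_sigma(means)
guaranteed = means - k * sigmas           # level R such that P[R_P < R] = eps

# Kataoka: efficient portfolio with the largest guaranteed level.
i_k = int(np.argmax(guaranteed))

# Telser: highest mean among portfolios whose guaranteed level is >= r_min.
r_min = -0.30
feasible = guaranteed >= r_min
if not feasible.any():
    raise ValueError("no efficient portfolio satisfies the safety condition")
i_t = int(np.flatnonzero(feasible)[-1])   # 'means' is increasing

print(f"Kataoka: mean={means[i_k]:.4f}, sigma={sigmas[i_k]:.4f}, R_min={guaranteed[i_k]:.4f}")
print(f"Telser : mean={means[i_t]:.4f}, sigma={sigmas[i_t]:.4f}")
```

In this example the Telser portfolio has a higher mean than the Kataoka portfolio, at the price of a lower guaranteed level, which is the trade-off described in the remark.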


REMARK 3.13 Obviously, probability distributions other than the Gaussian one must be used if asset returns exhibit skewness and fat tails, in particular when some payoffs are not linear with respect to given basic securities. Then, optimal solutions may no longer be mean-variance efficient. Numerical algorithms are generally needed to determine and analyze the optimal solutions, as shown in what follows.


3.2.2.3 CVaR Minimization


Alternatively, we can seek to minimize the CVaR (“expected shortfall”)
with or without specific constraints. For normally distributed loss functions,
mean-variance (MV) and VaR/CVaR minimizations are equivalent (see com- 
putation of VaR and CVaR for the Gaussian case in Chapter 2). Otherwise, 
especially for asymmetrical distributions, MV and VaR/CVaR portfolio opti- 
mizations may lead to significantly different solutions. Note also that unlike 
in the mean-VaR optimization problem, mean-shortfall optimization can be 
solved efficiently as a convex optimization problem, as shown in Bertsimas et 
al. [55]. 


General results about VaR and CVaR minimizations are provided in Rock- 
afellar and Uryasev ([427], [428]). They show that these problems can be 
based on a particular representation of the performance function, which then 
allows use of analytical or scenario-based optimization algorithms. When the 
number of scenarios is fixed, the problem is solved by linear programming or 
by non-smooth optimization methods. 


Consider the loss function $l(w, R)$ defined on weights $w$ and returns $R$. Let us assume that the vector of returns $R$ has a density $f_R$ (this assumption is not critical). For $\alpha \in (0,1)$, the $\alpha$-VaR denoted by $q_\alpha(w)$ is given by:

$$q_\alpha(w) = \min\{q \in \mathbb{R} : P[l(w, R) \le q] \ge \alpha\}.$$

The $\alpha$-CVaR denoted by $\phi_\alpha(w)$ is equal to (notation: $z^+ = \max(z, 0)$):

$$\phi_\alpha(w) = (1 - \alpha)^{-1} \int_{l(w,u) \ge q_\alpha(w)} l(w, u)\, f_R(u)\, du. \tag{3.64}$$

As can be seen, the $\alpha$-CVaR $\phi_\alpha(w)$ is the conditional expectation of the loss associated to the vector of weights $w$, given that this loss is equal to or greater than $q_\alpha(w)$.

The key idea is to characterize both the $\alpha$-VaR $q_\alpha(w)$ and the $\alpha$-CVaR $\phi_\alpha(w)$ by means of a convex function $F_\alpha(w, q)$ defined by:

$$F_\alpha(w, q) = q + (1 - \alpha)^{-1} \int_{u \in \mathbb{R}^n} [l(w, u) - q]^+ f_R(u)\, du. \tag{3.65}$$

We have (see [427]):
PROPOSITION 3.10


The function $F_\alpha(w, q)$ is convex and continuously differentiable w.r.t. $q$. The $\alpha$-CVaR $\phi_\alpha(w)$ of the loss associated with any vector of weights is given by:

$$\phi_\alpha(w) = \min_{q \in \mathbb{R}} F_\alpha(w, q). \tag{3.66}$$

The set $A_\alpha(w)$ of values $q$ for which the minimum is attained, namely

$$A_\alpha(w) = \arg\min_{q \in \mathbb{R}} F_\alpha(w, q),$$

is a bounded interval which is non-empty and closed. It can be reduced to a single point. Its left endpoint is the $\alpha$-VaR $q_\alpha(w)$. Using convexity results (see e.g., Rockafellar [426]), the existence of a unique solution can be proved under specific assumptions which eliminate a local but not global minimum.


Then, a characterization of the a-CVaR can be deduced (see [427]): 


PROPOSITION 3.11 
The minimization of the $\alpha$-CVaR $\phi_\alpha(w)$ is equivalent to minimizing $F_\alpha(w, q)$ over all $(w, q)$:

$$\min_{w \in A} \phi_\alpha(w) = \min_{(w,q) \in A \times \mathbb{R}} F_\alpha(w, q). \tag{3.67}$$

Moreover, a pair $(w^*, q^*)$ is an optimal solution of the right-hand side if and only if $w^*$ is an optimum of the left-hand side and $q^* \in A_\alpha(w^*)$. When the interval $A_\alpha(w^*)$ reduces to a single point, any solution $(w^*, q^*)$ of the minimization of the function $F_\alpha(w, q)$ is such that $w^*$ minimizes the $\alpha$-CVaR, and $q^*$ is the corresponding $\alpha$-VaR $q_\alpha(w^*)$.


Note also that $F_\alpha(w, q)$ is convex w.r.t. $(w, q)$, and $\phi_\alpha(w)$ is convex w.r.t. $w$, if the loss function $l(w, R)$ is convex w.r.t. $w$. Moreover, if the set $A$ of portfolio weights is convex, the optimization problem relies on convex programming (see Section 3.1.3).


REMARK 3.14 This approach allows us to avoid the VaR computation. The function $F_\alpha(w, q)$ can also be estimated from financial data or simulated by the Monte Carlo method for a given density $f_R$.
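The Monte Carlo estimation mentioned in Remark 3.14 can be sketched as follows. The example assumes the linear loss $l(w, R) = -w'R$, Gaussian simulated returns, and a sample version of $F_\alpha$ in which the integral is replaced by an average over scenarios; the function names, weights and distribution parameters are hypothetical.

```python
import numpy as np

def F_alpha(q, losses, alpha):
    """Sample version of F_alpha(w, q) = q + E[(l - q)^+] / (1 - alpha)."""
    return q + np.mean(np.maximum(losses - q, 0.0)) / (1.0 - alpha)

def var_cvar(losses, alpha):
    """Minimize the sample F_alpha over q; return (approximate VaR, CVaR).

    The sample F_alpha is convex and piecewise linear in q with kinks at the
    observed losses, so a minimum is attained at one of the sample points.
    """
    candidates = np.sort(losses)
    values = np.array([F_alpha(q, losses, alpha) for q in candidates])
    i = int(np.argmin(values))            # first minimizer found on the sorted grid
    return candidates[i], values[i]

# Hypothetical scenario generation: Gaussian returns for three assets.
rng = np.random.default_rng(0)
mu = np.array([0.05, 0.08, 0.12])
V = np.array([[0.040, 0.006, 0.000],
              [0.006, 0.090, 0.012],
              [0.000, 0.012, 0.160]])
scenarios = rng.multivariate_normal(mu, V, size=2000)

w = np.array([0.4, 0.4, 0.2])
losses = -(scenarios @ w)                 # assumed loss function l(w, R) = -w'R

alpha = 0.95
var_95, cvar_95 = var_cvar(losses, alpha)
print(f"VaR(95%) ~ {var_95:.4f}, CVaR(95%) ~ {cvar_95:.4f}")
```

With a fixed number of scenarios, minimizing the same sample function jointly over $(w, q)$ becomes a linear program once auxiliary variables are introduced for the positive parts; the one-dimensional search above only treats a fixed $w$.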


3.2.2.4 Efficient frontier 
When loss functions are normally distributed, MV and VaR/CVaR optimizations generate the same efficient frontier.


Another question is the equivalence of the representations of efficient frontiers with concave reward and convex risk functions.


This kind of result is known for the mean-variance case (see Steinbach [479]), and for mean-regret performance functions (see Dembo and Rosen [157]).

Krokhmal et al. [336] provide a general result about this equivalence, as shown in the next proposition.


PROPOSITION 3.12 
Let us consider the functions $\Phi(\cdot)$ (“the risk”) and $\Psi(\cdot)$ (“the reward”) w.r.t. the vector of weights. Consider the following three problems:


(P1) $\min_w \Phi(w) - a\Psi(w)$, $a \ge 0$, $w \in A$,
(P2) $\min_w \Phi(w)$, $\Psi(w) \ge b$, $w \in A$,
(P3) $\min_w -\Psi(w)$, $\Phi(w) \le c$, $w \in A$.


When the parameters $a$, $b$, and $c$ are varying, three corresponding efficient frontiers are generated.

Assume that constraints $\Psi(w) \ge b$ and $\Phi(w) \le c$ have internal points (under some regularity conditions from duality theory). If $\Phi(\cdot)$ is convex, $\Psi(\cdot)$ is concave, and $A$ is convex, then the three efficient frontiers are equal. This is the case when $\Phi(\cdot)$ is the $\alpha$-CVaR $\phi_\alpha(w)$ and $\Psi(\cdot)$ is the mean return.


3.3 Further reading 


There is a huge amount of literature concerning static portfolio manage- 
ment and, in particular, mean-variance analysis, both from the theoretical 
and empirical points of view. The book of Elton and Gruber [200] provides a 
general overview about standard problems. The classic texts on portfolio op- 
timization are the books of Markowitz ([375] and [376]). The book of Meucci 
[390] contains details about the Bayesian approach and computational meth- 
ods when additional constraints are introduced on portfolio weights. The 
diversification property for mean-variance efficient portfolios is analyzed in 
Green and Hollifield [266]. Large-scale optimization is studied in Perold [405], 
Best and Kale [69], Bixby et al. [72], Levkovitz and Mitra [351], and Mulvey 
et al. [395]. Extensions of mean-variance analysis to more general complete markets
are examined in Dybvig and Ingersoll [182]. Nowadays, mean-variance optimization
is also applied at the asset-class level. This is due to the increasing
range of asset classes, and also to the easier estimation of Markowitz inputs 
than for individual securities. As illustrated and mentioned for instance in 
Lederman and Klein [345], global asset allocation, in particular international 
diversification, and strategic asset allocation for long-term investment can be 
based on this approach. 

Michaud [391] examines statistical properties and their significant impact on portfolio optimization. Ledoit and Wolf [346] examine variance-covariance matrix tests when dimensionality is large. ARCH models can also give better ex-post predictions of the variance-covariance matrix. They allow for the introduction of factor models based on GARCH processes, which describe volatilities and correlations of securities, as shown in Herzog et al. [292]. Basic results about ARCH models can be found in Bollerslev et al. [80] and Gourieroux [260].

Utility maximization is mainly developed in a dynamic framework, as can 
be seen in Chapters 6 and 7. Kroll et al. [338] examine the relation between 
mean-variance analysis and utility maximization. Kallberg and Ziemba [315] 
compare optimal portfolio weights for different utility functions. Portfolio op- 
timization can also be based on higher-moments, involving skewness and kur- 
tosis. A fourth-order Taylor approximation of a utility function leads to such 
criterion. A three-moments portfolio choice has been analyzed by Athayde 
and Flôres [35] for a maximum skewness portfolio. The four-moments case is 
examined by Jurczenko and Maillet [312], and by Malevergne and Sornette 
([371],[372]). The latter articles also show how centered (absolute) moments 
and some cumulants are consistent measures of risks which can be used to 
generalize the mean-variance approach. Efficient frontiers for stochastic dominance
can also be examined, as in Ruszczyński and Vanderbei [444], and
Dentcheva and Ruszczyński [147]. Prospect theory and its connection to mean-variance
analysis is examined in Lévy and Lévy [353].

When probability distributions are not Gaussian, the safety first condition
can be studied by means of extreme value theory as in de Haan et al. [278];
or also by using the theory of large deviations. Giacometti and Lozza [252]
examine risk measures for asset allocation. They compare, in particular, port- 
folio choice on both historical data and simulated returns with jointly stable 
non-Gaussian returns. Differences between MV and CVaR efficient frontiers 
are illustrated in Mausser and Rosen [379], and Anderson et al. [25] for credit 
risk portfolio management, and Larsen et al. [344], and Chabaane et al. [111] 
for hedge funds management. Gaivoronski and Pflug [251] show that VaR 
and CVaR efficient frontiers can be quite different. Alexander and Baptista 
[18] compare the mean-VaR approach with mean-variance analysis. 

Other criteria can also be examined, such as mean-absolute deviation (see
Konno et al. [329],[331]) and the minimax rule (see Cai et al. [101], Teo and Yang
[491], and Young [508]). Athayde [34] studies the minimization of downside
risk. Chekhlov et al. [116] provide portfolio optimization results with draw- 
down constraints. 

Acerbi and Simonetti [4] examine the minimization problem of spectral measures of risk. They prove that minimizing risks with constrained returns, or maximizing returns with constrained risks (standard risk-reward problem) coincides with the unconstrained optimization of a single suitable spectral measure. This means that “minimizing a spectral measure turns out to be already an optimization process itself, where risk minimization and returns maximization cannot be disentangled from each other.”


Chapter 4 


Indexed funds and benchmarking 


4.1 Indexed funds 


The goal of passive management is to achieve returns identical to a spe- 
cific benchmark. It is based on holding a basket of securities designed, for 
example, to track a broad-market index which matches as closely as possible 
the return of the overall stock market. The most commonly used strategy is 
market capitalization weighting. No specific judgment is needed, and the risk 
of variation from the accepted asset-class benchmark is reduced as much as 
possible. 


Why invest in global passive funds? 


The first justification is the cost: Systems and personnel costs are lower than with active management since, once the procedure is in place, it is straightforward to use it. Transaction costs are also reduced. Indeed, since transactions are informationless, commissions are significantly smaller. Taxes are reduced since active portfolio strategies have a higher turnover. Moreover, bid-offer spreads are smaller because passive funds involve fewer small-capitalization stocks. Funds such as “trackers” also have low transaction costs.

The second is the performance: According to financial data, global indexes 
such as the S&P 500 and EAFE beat the median manager performance most 
of the time. The implicit assumption is that financial markets are efficient, 
and no manager has a superior performance. In fact, less than 20% of actively 
managed mutual funds, diversified on large-capitalizations, have outperformed 
the S&P 500 over the last 10 years. According to SEI Investments, more than 
half of the funds of the Euro area, which had performances above the average 
during the period 1998-2000, had lower performances than the average during 
the next two years. 


Indexed funds are relatively recent. Until 1993, only about 5% of the total amount of mutual funds was invested in indexed funds. However, during the year 1999, about 38% of new investments were based on passive management, which has become very popular.

When the goal is to reproduce as closely as possible a given financial index, the passive management is called an index tracking strategy. Two methods can be considered: first, to hold the same assets with the same weights as the index itself (exact replication); or, second, to choose a smaller subset of assets while minimizing the replication or tracking error for some given criterion (partial replication). For the second case, an objective function must be selected and specific constraints can be introduced, such as bounds upon individual weights, number of assets to be included in the portfolio, restrictions on transaction costs, etc.


4.1.1 Tracking error 


Most of the time, the objective function $T$ of the tracking error is defined as the variance of the difference between tracking portfolio return $R_P$ and index return $R_I$:

$$T = \sigma^2(R_P - R_I). \tag{4.1}$$

Such criterion is used by Toy and Zurach [493], Connor and Leland [126], 
and Larsen and Resnick [344]. Other criteria, based on absolute deviations 
instead of squared deviations, can also be used, as in Consiglio and Zenios 
[127], Rockafellar and Uryasev [428], and Konno and Wijayanayake [330]. 

Rudolf et al. [443], for example, introduce the following “linear” tracking- 
errors: 


• MAD (Mean Absolute Deviation).


Let $Y \in \mathbb{R}^T$ be the index return time series. Let $X \in \mathbb{R}^{n \times T}$ be the return matrix of the $n$ securities which are included in the replicating portfolio. Let $\beta \in \mathbb{R}^n$ be the vector of weights to be determined.


The optimal weights are deduced from the minimization of the sum of absolute deviations between index return and replicating portfolio return:

$$\hat{\beta} = \arg\min_{\beta} \sum_{t=1}^{T} \left| \sum_{i=1}^{n} X_{it}\beta_i - Y_t \right|,$$

where $X_t = (X_{1t},\dots,X_{nt})$.

• MinMax.


The optimal weights are determined against the worst case:

$$\hat{\beta} = \arg\min_{\beta} \left( \max_{t} |X_t\beta - Y_t| \right).$$

These two measures of tracking-errors can be extended by selecting only the values $X_t\beta$ smaller than $Y_t$ (i.e., Downside Risk). This latter criterion is called MADD (“Mean Absolute Downside Deviation”). Similarly, we define the DMinMax criterion (“Downside MinMax”).

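To summarize, we have (with $z^+ = \max(z, 0)$):

• $TE_{MAD}$: $\min_\beta \sum_{t=1}^{T} |X_t\beta - Y_t|$


• $TE_{MADD}$: $\min_\beta \sum_{t=1}^{T} (Y_t - X_t\beta)^+$


• $TE_{MinMax}$: $\min_\beta \max_t |X_t\beta - Y_t|$


• $TE_{DMinMax}$: $\min_\beta \max_t (Y_t - X_t\beta)^+$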
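The four linear tracking-error measures can be evaluated for a given weight vector as in the minimal sketch below. It stores the security returns with one row per date, i.e., as a $T \times n$ array (the transpose of the $n \times T$ convention used above); the function names and sample data are hypothetical, and the minimization over $\beta$ itself is not shown.

```python
import numpy as np

def deviations(X, Y, beta):
    """Tracking deviations X_t beta - Y_t, with X of shape (T, n)."""
    return np.asarray(X, float) @ np.asarray(beta, float) - np.asarray(Y, float)

def te_mad(X, Y, beta):
    """Sum of absolute deviations."""
    return np.sum(np.abs(deviations(X, Y, beta)))

def te_minmax(X, Y, beta):
    """Largest absolute deviation (worst case)."""
    return np.max(np.abs(deviations(X, Y, beta)))

def te_madd(X, Y, beta):
    """Sum of downside deviations: only dates where the portfolio lags the index."""
    return np.sum(np.maximum(-deviations(X, Y, beta), 0.0))

def te_dminmax(X, Y, beta):
    """Largest downside deviation."""
    return np.max(np.maximum(-deviations(X, Y, beta), 0.0))

# Hypothetical data: three dates, two securities.
X = np.array([[0.01, 0.02],
              [0.03, -0.01],
              [0.00, 0.01]])
Y = np.array([0.01, 0.02, 0.005])
beta = np.array([0.5, 0.5])

print(te_mad(X, Y, beta), te_minmax(X, Y, beta),
      te_madd(X, Y, beta), te_dminmax(X, Y, beta))
```

Since all four objectives are piecewise linear in $\beta$, their minimization can be cast as a linear program.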


4.1.2 Simple index tracking methods 


Index tracking-errors can be based on statistical methods, such as the coin- 
tegration approach, or on calibration-type algorithms, such as the threshold 
accepting algorithm. 


4.1.2.1 Exact market capitalization weighting 


This method consists of investing in all assets of the index proportional 
to their shares in the index. This perfect replication seems to be the ideal 
method. However, some difficulties may appear: 


e The invested amounts must be sufficiently high in order to avoid round 
weighting problems. 


e Depending on management style, liquidity problems can occur, for ex- 
ample on small-capitalizations. 


e If the index is modified, high transaction costs reduce the performance. 


4.1.2.2 Stratified replication


This method is based on the decomposition of the index according to a set of characteristics which can be weighted. Then, the replicating portfolio must have the same decomposition. For example, consider an equity index with $i = 1,\dots,n$ stocks belonging to $j = 1,\dots,m$ industrial sectors, and $k = 1,\dots,l$ styles (small-cap, big-cap, growth, value, etc.). Then, the replicating portfolio must have the same respective percentages $(p_j\%, q_k\%)$ of sectors and styles, but not necessarily the same stocks with the same weights. However, this method does not indicate the types and the optimal number of these characteristics.


4.1.2.3 Synthetic replication 


Derivatives written on the index may also be used (if they exist). However, if maturity dates do not coincide, options must be rolled over. This induces additional costs since, for example, when options are not always at-the-money, it is more difficult to forecast their prices.


4.1.2.4 Optimum sampling replication


Meade and Salkin ([382],[383]) use quadratic programming to determine 
the optimal tracking portfolio weights. However, they use a pre-selected set of 
securities. As mentioned in Tabata and Takeda [486], index fund management 
requires: 


e Minimization of the number of assets in the tracking portfolio. 


e Minimization of a function of the tracking-error between portfolio and 
index. 


They propose an algorithm which determines a locally optimal tracking 
portfolio. 


4.1.3 The threshold accepting algorithm 


Dueck et al. ([171],[172]) have proposed the so-called threshold accepting
algorithm for the risk and reward optimization problem. Gilli and Këllezi
[255] examine the performance of the threshold accepting algorithm for in- 
dex tracking. They assume the existence of transaction costs for portfolio 
rebalancing. Their approach is presented in this section. 


4.1.3.1 The model 


Assume that there are (n4 + 1) securities in the index to be replicated. 
Asset 0 is assumed to be the risk free asset with constant rate r. Let pi; be 
the price at time t of asset i, i = 1, ..., nA. 

Let J; be the index value at time t. Its return on time period [t — 1,1] is 


given by: 
E 
r; =ln i 
lı 


Let $x_{i,t}$ be the quantity invested in the $i$th asset in the tracking portfolio
$P_t$ at time $t$:

$$P_t = \{x_{i,t} \mid i = 0, 1, \dots, n_A\}.$$

Consider the set of indices corresponding to assets included in portfolio $P_t$:

$$J_t = \{i \mid x_{i,t} \neq 0\}.$$

Denote by $V_t$ the value of portfolio $P_t$:

$$V_t = \sum_{i=0}^{n_A} x_{i,t}\, p_{i,t} = \sum_{i \in J_t} x_{i,t}\, p_{i,t}.$$

The corresponding weights are given by:

$$w_{i,t} = \frac{x_{i,t}\, p_{i,t}}{V_t}.$$

Denote also by $V_{t^-}$ the tracking portfolio value before $t$:

$$V_{t^-} = \sum_{i=0}^{n_A} x_{i,t-1}\, p_{i,t}.$$

Without transaction costs, the portfolio return $r_t^P$ on time period $[t-1, t]$ is
given by:

$$r_t^P = \ln\left(\frac{V_{t^-}}{V_{t-1}}\right) = \ln\left(\frac{\sum_{i=0}^{n_A} x_{i,t-1}\, p_{i,t}}{\sum_{i=0}^{n_A} x_{i,t-1}\, p_{i,t-1}}\right). \qquad (4.2)$$


Assume that transaction costs $C_t$ are proportional to absolute changes in each
security $i$:

$$C_t = c \sum_{i=0}^{n_A} p_{i,t}\, |x_{i,t} - x_{i,t-1}|, \qquad (4.3)$$

where $c$ is a given positive real number.
Then, the portfolio return, taking account of transaction costs, is given by:

$$r_t^P = \ln\left(\frac{V_{t^-}}{V_{t-1}}\right) = \ln\left(\frac{V_{t^-}}{V_{(t-1)^-} - C_{t-1}}\right). \qquad (4.4)$$
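
The cost and return definitions of this subsection can be checked on a small numerical case. The sketch below is only an illustration: the prices, holdings, cost rate and the helper names `value` and `transaction_cost` are hypothetical, and the riskless holding is chosen so that the rebalanced portfolio plus the cost paid equals the value before rebalancing.

```python
import math


def value(quantities, prices):
    """Portfolio value: sum over assets of quantity times price."""
    return sum(x * p for x, p in zip(quantities, prices))


def transaction_cost(c, prices, x_new, x_old):
    """Proportional cost as in (4.3): c * sum_i p_i * |x_new_i - x_old_i|."""
    return c * sum(p * abs(a - b) for p, a, b in zip(prices, x_new, x_old))


# Hypothetical data: asset 0 is the riskless asset, assets 1 and 2 are stocks.
c = 0.005                        # proportional cost rate
p_prev = [1.0, 50.0, 20.0]       # prices at time t-1
p_now = [1.001, 51.0, 19.5]      # prices at time t
x_old = [1000.0, 10.0, 25.0]     # holdings just before rebalancing at t-1

v_before = value(x_old, p_prev)  # value before rebalancing at t-1

# Rebalancing at t-1: buy 4 units of asset 1, financed by the riskless asset.
stocks_new = [14.0, 25.0]
stock_value = value(stocks_new, p_prev[1:])
stock_turnover = sum(p * abs(a - b)
                     for p, a, b in zip(p_prev[1:], stocks_new, x_old[1:]))

# Riskless holding such that (new value + cost) = old value.
# This closed form assumes that the riskless asset is being sold.
x0_new = ((v_before - stock_value - c * stock_turnover
           - c * p_prev[0] * x_old[0]) / (p_prev[0] * (1.0 - c)))
x_new = [x0_new] + stocks_new

cost = transaction_cost(c, p_prev, x_new, x_old)  # cost paid at t-1
v_after = value(x_new, p_prev)                    # value after rebalancing
v_next = value(x_new, p_now)                      # value before t

r_42 = math.log(v_next / v_after)                 # form used in (4.2)
r_44 = math.log(v_next / (v_before - cost))       # form used in (4.4)
```

Both expressions give the same return here because the value after rebalancing is, by construction, the value before rebalancing minus the cost.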


4.1.3.2 Objective function 


Given the observations of prices $p_{i,t}$ and $I_t$ at times $t_1, \dots, t_2$, we search
for the portfolio which would have replicated as closely as possible the index
return over the period $[t_1, t_2]$. Therefore:

- First we have to choose a criterion $F_{t_1,t_2}$, which is a function of the tracking
error.

- Second, we search for quantities $x_i$, $i = 1, \dots, n_A$, such that the corresponding
portfolio minimizes $F_{t_1,t_2}$. As seen previously, several measures can
be used. For example, the $\alpha$-norm:

$$F_{t_1,t_2} = \frac{\left(\sum_{t=t_1}^{t_2} |r_t^P - r_t^I|^{\alpha}\right)^{1/\alpha}}{t_2 - t_1}, \qquad (4.5)$$

where $\alpha > 0$. Gilli and Këllezi [255] use this criterion for $\alpha = 1$.
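
As an illustration of criterion (4.5), the following sketch evaluates the $\alpha$-norm of the tracking error for two short return series. The function name `tracking_error` and the data are hypothetical, and the divisor $t_2 - t_1$ is taken here to be the number of return observations, which is an assumption about the time indexing.

```python
import math


def tracking_error(r_portfolio, r_index, alpha=1.0):
    """Alpha-norm of the return differences, divided by the number of periods."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if len(r_portfolio) != len(r_index) or not r_portfolio:
        raise ValueError("return series must be non-empty and of equal length")
    total = sum(abs(rp - ri) ** alpha for rp, ri in zip(r_portfolio, r_index))
    return total ** (1.0 / alpha) / len(r_portfolio)


# Hypothetical returns of the tracking portfolio and of the index.
r_p = [0.010, -0.020, 0.030]
r_i = [0.000, 0.000, 0.000]

te_1 = tracking_error(r_p, r_i, alpha=1.0)   # mean absolute deviation
te_2 = tracking_error(r_p, r_i, alpha=2.0)   # Euclidean norm over the periods
```

With $\alpha = 1$ the criterion is a mean absolute deviation, which is less sensitive to a single large deviation than the case $\alpha = 2$.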


4.1.3.3 Constraints
Several additional constraints can be introduced.
- No shortselling:

$$x_{i,t} \geq 0, \quad i = 0, \dots, n_A.$$

- Minimum and maximum amount on individual securities:

$$\varepsilon_i \leq \frac{x_{i,t}\, p_{i,t}}{\sum_{j} x_{j,t}\, p_{j,t}} \leq \delta_i,$$

where $\varepsilon_i$ and $\delta_i$ are exogenous.
- Upper bound $K$ on the number of assets:

$$\#\{J_t\} \leq K.$$

- Constraints on transaction costs: for a given non-negative coefficient $\gamma$,

$$C_t \leq \gamma V_{t^-}.$$

Other constraints can also be considered, such as limits on asset groups.


4.1.3.4 Optimization problem 


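Assume that at time $t_2$ we have the portfolio $P_{t_2^-} = \{x_{i,t_2^-}\}$ with value $V_{t_2^-}$.
We attempt to “optimally” rebalance this portfolio in order to replicate the
index during the time period $[t_2, t_3]$. Thus, we search for the “ideal” portfolio
$P^*_{t_1} = \{x^*_{i,t_1}\}$, unchanged on $[t_1, t_2]$, which minimizes the function $F_{t_1,t_2}$ of
the tracking error on the time period $[t_1, t_2]$. Therefore, we have to solve:

$$\min F_{t_1,t_2}$$

with

$$\begin{cases}
C_{t_1} \leq \gamma V_{t_1^-}, \\
\sum_{i \in J_{t_1}} x_{i,t_1}\, p_{i,t_1} + C_{t_1} = V_{t_1^-}, \\
\varepsilon_i \leq \dfrac{x_{i,t_1}\, p_{i,t_1}}{\sum_{j \in J_{t_1}} x_{j,t_1}\, p_{j,t_1}} \leq \delta_i, \quad i \in J_{t_1}, \\
\#\{J_{t_1}\} \leq K.
\end{cases}$$

Then, assuming that $P^*_{t_1}$ will be also optimal for the time period $[t_2, t_3]$, at
time $t_2$ we choose the portfolio $P_{t_2}$ such that $w_{i,t_2} = w^*_{i,t_1}$.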


4.1.3.5 Implementation 


The TA algorithm is a local search algorithm which escapes local minima
by accepting solutions which are not worse than the current one by more than
a given threshold. This threshold is gradually decreased and is equal to 0 after
a given number of steps. The optimization problem can be defined as follows.

Let $f : \mathcal{P} \to \mathbb{R}$ be the objective function, where $\mathcal{P}$ is the discrete set of all
feasible replicating portfolios. Consider $f_{opt}$, the minimum of $f$ over the set
$\mathcal{P}$:

$$f_{opt} = \min_{x \in \mathcal{P}} f(x).$$

Consider the set of all optimal solutions:

$$\mathcal{P}_{min} = \{x \in \mathcal{P} \mid f(x) = f_{opt}\}.$$

The TA algorithm provides either a solution in $\mathcal{P}_{min}$, or a solution close to an
element of $\mathcal{P}_{min}$. It is based on the use of a neighborhood function, defined as
follows:

DEFINITION 4.1 - Let $X$ be the set of all acceptable configurations for
a given problem. We call “neighborhood” any function $N : X \to 2^X$, where
$2^X$ is the class of all subsets of $X$.
- A “mechanism of exploration of the neighborhood” is a procedure that
indicates how we pass from a configuration $s \in X$ to a configuration $s' \in N(s)$.
- A configuration $s$ is a local minimum with respect to the function $N$ if
$f(s) \leq f(s')$ for any configuration $s' \in N(s)$.


For the TA algorithm, at any iteration $r$, the acceptance of the following
configuration $s' \in N(s)$ is only based on a function $r(s', s)$ and a threshold
$T_r$: $s'$ is accepted if $r(s', s) < T_r$.

The sequence $(T_r)_r$ is decreasing and $T_r \to 0$. We start with a given portfolio
$P_0$. Then, the procedure indicates how to pass from one configuration,
which is randomly chosen, to another which is in the neighborhood of the
previous one. It takes account of the objective function. The process stops as
soon as the fixed number of iterations is reached (which limits computation
time), or if another given condition is satisfied.

For example, Gilli and Këllezi [255] use the following function:

$$r(s', s) = f(s') - f(s).$$

The portfolio $P_1$ is determined as follows: an asset index $i_1$ is randomly chosen
in $J_{P_0}$, and a fixed amount of the corresponding asset is sold, this amount being
converted into a number of assets. Next, an asset $j_1$ is randomly bought. If the
size constraint is already binding for the portfolio $P_0$, asset $j_1$ is chosen in $J_{P_0}$.
After selling $i_1$ and buying $j_1$, the portfolio constraints are examined. Adjustments
are made if the constraints are not satisfied.
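
The acceptance rule and the sell/buy neighborhood described above can be sketched on a toy problem. This is a minimal illustration, not the implementation of Gilli and Këllezi: the weight grid (`UNITS`), the threshold sequence, the simulated returns and all function names are hypothetical, and transaction costs and cardinality constraints are left out.

```python
import random


def threshold_accepting(f, start, neighbor, thresholds, steps_per_threshold, rng):
    """Generic TA loop: accept s' when f(s') - f(s) is below the threshold."""
    current, f_current = start, f(start)
    best, f_best = current, f_current
    for threshold in thresholds:          # decreasing sequence ending at 0
        for _ in range(steps_per_threshold):
            candidate = neighbor(current, rng)
            f_candidate = f(candidate)
            if f_candidate - f_current < threshold:
                current, f_current = candidate, f_candidate
                if f_current < f_best:    # keep track of the best visited point
                    best, f_best = current, f_current
    return best, f_best


UNITS = 20  # weights are multiples of 1/20 (hypothetical grid)


def make_objective(asset_returns, index_returns):
    """Mean absolute deviation between portfolio and index returns."""
    def objective(units):
        total = 0.0
        for row, r_index in zip(asset_returns, index_returns):
            r_portfolio = sum(u * r for u, r in zip(units, row)) / UNITS
            total += abs(r_portfolio - r_index)
        return total / len(index_returns)
    return objective


def neighbor(units, rng):
    """Sell one weight unit of a held asset and buy one unit of another asset."""
    held = [i for i, u in enumerate(units) if u > 0]
    i = rng.choice(held)
    j = rng.choice([k for k in range(len(units)) if k != i])
    new_units = list(units)
    new_units[i] -= 1
    new_units[j] += 1
    return tuple(new_units)


rng = random.Random(0)
true_units = (8, 6, 4, 2)  # hypothetical index composition
asset_returns = [[rng.gauss(0.0, 0.02) for _ in true_units] for _ in range(60)]
index_returns = [sum(u * r for u, r in zip(true_units, row)) / UNITS
                 for row in asset_returns]

f = make_objective(asset_returns, index_returns)
start = (5, 5, 5, 5)
thresholds = [0.002, 0.001, 0.0005, 0.0]
best, f_best = threshold_accepting(f, start, neighbor, thresholds, 200, rng)
```

Because the last threshold is 0 and the inequality is strict, the final pass only accepts strict improvements, so the search ends as a plain descent.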


REMARK 4.1 The choices of assets $i$ and $j$ are random. However, we
can search for special probability distributions in order to improve the
convergence. From the theoretical point of view, results about stochastic
convergence have been proved by Van Laarhoven and Aarts [497], and Zhigljavsky
[512]. Beasley et al. [52] introduce an index tracking method based on genetic
algorithms. A complete proof of convergence of genetic algorithms is given in
Cerf [109].


4.1.3.6 Empirical results 


Gilli and Këllezi [255] test the algorithm when the exact solution is known.
Using different constraints and objective functions, they show that the TA
algorithm is an efficient method of solving index tracking and also
benchmarking problems. Such a result is also shown for example in [416], both
for actual financial data and for GARCH simulations. To illustrate the
efficiency of the TA algorithm, consider a simulation of a multidimensional
DCC-MVGARCH model (see Appendix A). The correlation matrix is estimated
from data on the period from February 2002 to May 2004. Each of the 40
stocks is modelled by a GARCH(1,1) process. The following figures illustrate
the random behavior of the objective function MADD and its performance.

FIGURE 4.1: MADD minimization with ten stocks


On the first period, the manager applies the TA algorithm to determine the
replicating portfolio weights. On the second period, the value of this portfolio
is compared with the index value. The MADD minimization is quite efficient.

FIGURE 4.2: MADD minimization: estimation and test

However, in order to limit transaction costs, we also have to take care of
the weighting stability if the manager decides to rebalance the portfolio. To
illustrate this constraint, consider another DCC-MVGARCH simulation for
which the manager rebalances the portfolio when one stock weight in the
portfolio deviates by more than 1.5 percent (w.r.t. the corresponding weight
in the index). For example, consider an initial invested amount equal to one
million euros, and a proportional transaction cost of 0.5 percent. Then, we
note that generally the manager will rebalance the replicating portfolio twice
(rebalancing times: $T_1$ and $T_2$), for which the differences in the weighting are
given in the next table. The replicating result is quite satisfactory.


TABLE 4.1: MADD minimization and weighting differences


Initial weighting | Differences in the weighting at $T_1$ | Differences in the weighting at $T_2$

FIGURE 4.3: MADD minimization with constraints


4.1.4 Cointegration tracking method


Nowadays, the notion of cointegration is widely applied in time series anal- 
ysis with applications to macroeconomics and finance. Practitioners have 
recently begun to use this method for indexed fund management or detection 
of statistical arbitrage (see Alexander and Dimitriu [15] and Dunis and Ho 
[179]). 

Initially, financial modelling used the analysis of return correlation
coefficients. Correlation and cointegration are two related but distinct notions:
correlation measures the short term dependency of two return series, while
cointegration measures the long term dependency. Indeed, correlation
analysis implies dealing with stationary time series. Therefore, data series
must be transformed into stationary series by removing the trend or by
differencing them (see Granger and Joyeux [263]). Consequently, common
stochastic trends between securities are ignored. The cointegration approach
can use more information. Thus, this method can better describe long term
relations. It is also powerful since it allows for the use of simple statistical
methods such as least squares regressions, in order to study non-stationary
processes.

In this section, some basic properties of cointegration are discussed (for 
basic statistical properties, see Greene [267]). 


4.1.4.1 Tests of unit roots 


To determine the type of non-stationarity, we have to introduce stationarity
tests (see Appendix A for basic statistical definitions and properties):


• Deterministic trend: function of time ($t$, $t^2$, $\log(t)$, ...). In this case, the
mean is increasing (or decreasing) but the variance is constant.


• Stochastic trend: the process $X$ is such that

$$X_t = a_0 + a_1 t + \varepsilon_t, \text{ with } \varepsilon \sim ARMA.$$

Such a process is stationary “around its mean.” This stochastic trend is
due to the existence of a unit root in the autoregressive process $X$:

$$X_t = a_0 + X_{t-1} + \varepsilon_t, \text{ with } \varepsilon \sim ARMA.$$

Then, its variance may explode with time and its perturbations are
persistent. We have to difference it to get a stationary time series:

$$\Delta X_t = \beta_0 + \beta_1 t + (\phi - 1) X_{t-1} + \varepsilon_t, \text{ with } \varepsilon_t \sim WN(0, \sigma^2).$$

The stationarity test (called the Dickey-Fuller test) is the following:


• Null hypothesis: $(H_0): \phi - 1 = 0$,

• Alternative hypothesis: $(H_1): \phi - 1 < 0$.


If hypothesis $H_1$ is rejected (that is, $H_0$ cannot be rejected), then the process
is considered non-stationary. Coefficients $\beta_1$ and $\beta_0$ allow us to check if there
exists a deterministic/stochastic trend. A special statistic has been introduced
by Dickey and Fuller [180] (see also Davidson and MacKinnon [178]). However,
this test is limited since it assumes that the noise is a white noise. Thus, we
have to use the “Augmented Dickey-Fuller” (ADF) test.

Additional lagged variables are introduced:

$$\Delta X_t = \beta_0 + \beta_1 t + (\phi - 1) X_{t-1} + \sum_{j=1}^{p} \lambda_j \Delta X_{t-j} + \varepsilon_t, \text{ with } \varepsilon_t \sim WN(0, \sigma^2).$$

The number of lags $p$ for the variable $\Delta X_{t-j}$ is chosen such that $\varepsilon_t$ is a white
noise.
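
To make the test regression concrete, the sketch below computes the Dickey-Fuller statistic in the simplest case (a constant, no time trend, no lagged differences, that is $\beta_1 = 0$ and $p = 0$) for a simulated stationary AR(1) series and for a simulated random walk. The function names and simulation parameters are hypothetical. The resulting t-statistic must be compared with Dickey-Fuller critical values, not with the usual Student ones.

```python
import math
import random


def dickey_fuller_stat(series):
    """OLS of dX_t on a constant and X_{t-1}.

    Returns the estimate of (phi - 1) and its t-statistic.
    """
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(lagged, diffs))
    std_error = math.sqrt(ssr / (n - 2) / sxx)
    return slope, slope / std_error


def simulate_ar1(phi, n, rng):
    """X_t = phi * X_{t-1} + e_t, with standard Gaussian white noise e_t."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


rng = random.Random(1)
stationary = simulate_ar1(0.5, 500, rng)   # phi = 0.5: no unit root
random_walk = simulate_ar1(1.0, 500, rng)  # phi = 1: unit root

slope_st, t_st = dickey_fuller_stat(stationary)
slope_rw, t_rw = dickey_fuller_stat(random_walk)
```

For the stationary series the estimate of $\phi - 1$ is close to $-0.5$ with a strongly negative t-statistic, whereas for the random walk the estimate stays close to 0.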


REMARK 4.2 For a stationary time series, we have $\phi = 0$. When $d$ unit roots
exist, the time series $X$ is said to be $I(d)$. It must be differenced $d$ times in order
to get a stationary series (see Granger [264]). Note that financial time series
are generally such that $d < 2$.


4.1.4.2 Cointegration definition 


Consider two time series $(X_t)_t$ and $(Y_t)_t$. If they are both $I(1)$, any linear
combination of these two variables is generally not stationary. However, in
some cases, this combination is $I(0)$, which means that it is stationary. In
that case, the two series $(X_t)_t$ and $(Y_t)_t$ are said to be cointegrated:

$$Y_t = a + b X_t + Z_t, \text{ where } Y_t \sim I(1),\ X_t \sim I(1) \text{ and } Z_t \text{ is stationary.} \qquad (4.7)$$

This means that these two series have similar trends such that, for a specific
linear combination, their trends offset each other to provide a stationary
time series. If $Z_t$ is strongly stationary, then the difference $Y_t - (a + b X_t)$
always has the same probability distribution. There exists a “long term
equilibrium” between the two series.


From relation (4.7), to test the cointegration property, it is sufficient to test
the stationarity of the residual terms $z_t$.


This can be done by using Dickey-Fuller tests such as:

$$\Delta z_t = (\psi - 1) z_{t-1} + \varepsilon_t.$$


4.1.4.3 Tracking portfolio determination
This problem is solved by using the following three steps:


• First step. We have to select a given subset of assets, either by means
of, for example, “stock picking,” or by statistical methods. This step is
essential to get a significant property of cointegration, which determines
the quality of the tracking method. It cannot be based on cointegration.
The manager has to test different securities combinations to be included
in the replicating portfolio. In particular, he must select the number of
assets. The higher this number, the more stable the cointegration (in
the absence of costs).


• Second step. The logarithm of the index value is regressed by least
squares estimation (LSE) on the logarithms of the previously selected
asset prices:

$$\log(I_t) = c_1 + \sum_{k=1}^{n} c_{k+1} \log(P_{k,t}) + \varepsilon_t. \qquad (4.8)$$

Note that by the logarithm transformation, time series are more
homogeneous. If time series are cointegrated, then their logarithms are also
cointegrated. The residuals are stationary if and only if $\log(I_t)$ and the
tracking portfolio $\sum_{k=1}^{n} c_{k+1} \log(P_{k,t})$ are cointegrated. The coefficients $c_{k+1}$
are the weights of the tracking portfolio. In order to have a cointegration
relation, the residual term $\varepsilon_t$ in relation (4.8) must be $I(0)$ and all
the asset prices in the tracking portfolio must be $I(d)$ with $d \geq 1$. Otherwise,
the coefficients $c_{k+1}$, determined from LSE, are not efficient and may be
spurious.


• Third step. We have to estimate the following dynamic of index return:

$$\Delta \log(I_t) = \gamma\, \varepsilon_{t-1} + \sum_{h=1} a_h \Delta \log(I_{t-h}) + \sum_{k \in CI} \tau_k \Delta \log(P_{k,t}) + u_t. \qquad (4.9)$$

This kind of regression is an “error correction model”: it corrects short
term effects.


The coefficient $\gamma$ models the speed of mean-reversion to the long term
value, given by the cointegration relation. It must be non-positive.
Coefficients $\tau_k$ are normalized such that $\sum_{k} \tau_k = 1$, since they represent
the weights of the tracking portfolio.
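
The logic of the second step, combined with the residual-based test of Section 4.1.4.2, can be sketched with a single regressor. This is only an illustration on simulated data: the series, the parameter values and the function names `ols` and `df_stat_no_constant` are hypothetical. The resulting statistic has to be compared with critical values specific to residual-based cointegration tests.

```python
import math
import random


def ols(x, y):
    """Simple OLS of y on a constant and x: (intercept, slope, residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, residuals


def df_stat_no_constant(z):
    """t-statistic of (psi - 1) in dz_t = (psi - 1) z_{t-1} + e_t."""
    lagged = z[:-1]
    diffs = [b - a for a, b in zip(z[:-1], z[1:])]
    szz = sum(v * v for v in lagged)
    coef = sum(v * d for v, d in zip(lagged, diffs)) / szz
    ssr = sum((d - coef * v) ** 2 for v, d in zip(lagged, diffs))
    std_error = math.sqrt(ssr / (len(lagged) - 1) / szz)
    return coef / std_error


rng = random.Random(2)
n = 500
x = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + rng.gauss(0.0, 1.0))        # X is a random walk: I(1)
z = [0.0]
for _ in range(n - 1):
    z.append(0.3 * z[-1] + rng.gauss(0.0, 1.0))  # Z is a stationary AR(1)
y = [1.0 + 2.0 * xv + zv for xv, zv in zip(x, z)]  # Y = a + b X + Z

a_hat, b_hat, residuals = ols(x, y)              # long term relation
t_residuals = df_stat_no_constant(residuals)     # stationarity of residuals
```

In this simulated case the slope estimate is close to the true value 2 and the statistic on the residuals is strongly negative, which is what a cointegration relation should produce.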


4.1.4.3.1 First step: causality To determine the assets to introduce in
the tracking portfolio, the notion of causality between the tracking portfolio
and the index can be introduced. This notion has been considered in Granger
[262]. A series causes another one “if the knowledge of the first
one improves the forecast of the second one.”

Granger causality test. This is based on linear regressions.


DEFINITION 4.2 Let $X$ be a multivariate process $X_t = (X_{1,t}, \dots, X_{n,t})$.
The random variable $X_j$ does not cause $X_k$ if and only if, at any time $t$, the
knowledge of the past of $X_j$, $\underline{X}_{j,t-1}$, does not improve the forecast of $X_{k,t+H}$,
for any time horizon $H$:

$$EL\left(X_{k,t+H} / \underline{X}_{i,t-1};\ 1 \leq i \leq n\right) = EL\left(X_{k,t+H} / \underline{X}_{-j,t-1}\right), \qquad (4.10)$$

where $EL(./.)$ is the linear regression operator, and $\underline{X}_{-j,t-1}$ is the set of random
variables $\{\underline{X}_{i,t-1};\ i \neq j\}$.

In practice, for any asset $i$, consider the following regression on the rates of
return:

$$r_t^i = a_0 + \sum_{p=1}^{k} \alpha_p\, r_{t-p}^i + \sum_{p=1}^{k} \beta_p\, r_{t-p}^I + \varepsilon_t. \qquad (4.11)$$

To determine the optimal value of the number $k$ of lags, the Akaike criterion
(AIC) [11] can be used:

$$AIC = -2\left(\frac{L}{T}\right) + 2\,\frac{k}{T}, \qquad (4.12)$$

where $L$ is the likelihood, and $T$ is the number of observations. The Granger
test on Equation (4.11), for any asset $i$, is based on a Fisher test on the joint
hypothesis:

$$H_0 : \beta_1 = \dots = \beta_k = 0. \qquad (4.13)$$

Sims causality. Another definition of causality is given by Sims [471]. It
is based on impulse analysis. A time series causes another one if a random
shock on the first one has an impact on the other one, especially on the
variance of its forecast errors about its future values. To characterize the
Sims causality between two stationary time series, consider the Wold canonical
decomposition:

$$X_t = (X_{1,t}, X_{2,t})' = C(L)\, \varepsilon_t = \sum_{i=0}^{\infty} C_i\, \varepsilon_{t-i}, \qquad (4.14)$$

where $L$ is the lag operator, $C(0) = I_2$, and $\varepsilon_t$ is the canonical innovation of
$X_t$, which has the variance-covariance matrix $\Sigma = (\sigma_{ij})$:

$$\varepsilon_t = X_t - EL\left(X_t / \underline{X}_{t-1}\right). \qquad (4.15)$$

The random variable $X_{i,t}$ can be decomposed on two moving averages,
according to:

$$X_{i,t} = \sum_{j=1}^{2} C_{ij}(L)\, \varepsilon_{j,t}. \qquad (4.16)$$

According to Sims, at any time $t$, past shocks on variables $X_j$ have an impact
on $X_{i,t}$, through the variable $C_{ij}(L)\varepsilon_{j,t}$. This approach allows us to measure
the influence of different impulses on variables. Therefore, the manager can
select securities to be included in the tracking portfolio by using two causality
criteria: the first one based on forecast improvement, the second on impulse
reaction.


4.1.4.3.2 Second step: cointegration tests The first method, due to
Engle and Granger [205], uses the following approach. We begin by searching for
a long term relation between a dependent variable and explanatory variables.
Then we have to introduce an error correction model for the cointegrated
variables. Nevertheless, this method assumes prior relations and special causality
between the variables.


The second method uses the maximum likelihood, introduced by Johansen
[304]. Since this method uses a multivariate model, it allows for distinguishing
several cointegrating vectors. Stock and Watson [481] prove that under the
cointegration hypothesis (i.e., $\varepsilon$ is stationary), the estimators of the
coefficients $c_k$ in Equation (4.8) have a very fast speed of convergence: it is equal
to $T$ and not to $\sqrt{T}$, as is usual.


4.1.4.3.3 Third step: stability tests After the estimation of the track- 
ing portfolio weights, the fund manager faces the stability problem of these 
coefficients. One possible stability test is the “Cumulative Sum Test” defined 
by Brown et al. [92]. 


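From Equation (4.9), consider the recursive residual terms $w_t$ defined by:

$$w_t = \frac{y_t - x_t' b_{t-1}}{\left(1 + x_t' (A_{t-1}' A_{t-1})^{-1} x_t\right)^{1/2}}, \qquad (4.17)$$

where $x_t$ is the observation vector at time $t$, $A_{t-1}$ the regressor matrix from
time 1 to time $t-1$, $b_{t-1}$ the vector of regression coefficients estimated on this
sample, and $y_t - x_t' b_{t-1}$ the value of the forecast error. Its variance is given by:

$$\sigma^2 \left(1 + x_t' (A_{t-1}' A_{t-1})^{-1} x_t\right). \qquad (4.18)$$
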
The CUSUM test is based on the statistics:

$$W_t = \sum_{r=k+1}^{t} \frac{w_r}{s}, \qquad (4.19)$$

where the $w_r$ are the recursive residual terms, and $s$ is the standard
deviation of the regression on the whole sample. If the tracking portfolio
weights are constant, then $E(W_t) = 0$.

REMARK 4.3 Empirical results are mixed. For example, Alexander
and Dimitriu [15] compare properties of tracking portfolios based on
cointegration and those based on tracking error quadratic minimization. They
conclude that there is no significant advantage in using cointegration. However,
on other financial data, Dunis and Ho [179] show that the cointegration
method reduces the number of rebalancing times. In [417] and [487], such
analyses are made on the main French stock index CAC40. The cointegration
approach appears to be an efficient method.


4.2 Benchmark portfolio optimization 


As seen in Chapter 3, asset allocation can be based on mean-variance op- 
timization. This allows for the determination of a set of asset class weights. 
This can induce a “long-term” investment plan, often called the portfolio’s 
strategic asset allocation. It can be viewed as a benchmark for portfolio opti- 
mization. Indeed, the choice of the benchmark is crucial. 


However, typically, the portfolio manager has the task of “beating” the
benchmark. He may have short-term forecasts which deviate from the long-term
forecasts associated with the benchmark. In that case, market timing (active
asset allocation) may lead to a divergent portfolio choice. This approach
is usually called tactical asset allocation. Another example of such divergence
from an index or a benchmark is the core/satellite approach: according
to some specialists, an investor would invest on the one hand in an index fund,
which is considered as the “core” portfolio, and on the other hand choose
a skillful manager to add value. This second investment, usually called the
“satellite,” is based on active management.


But, generally, the portfolio manager must choose a portfolio that does
not deviate too much from the benchmark. The tracking-error between the
portfolio and the benchmark must be controlled. As seen in previous sections,
several objective functions can be introduced to measure the risk of the
tracking. The most common function is the standard deviation. Note that for
benchmark portfolio optimization, the tracking error is not necessarily
minimized, as it is for index fund management. For instance, it can be fixed
at a given “rational” level and other decision criteria can be introduced to
determine optimal solutions.


4.2.1 Tracking-error definition
4.2.1.1 Tracking ex-ante and tracking ex-post 


The tracking error is defined as the difference between portfolio and
benchmark returns. It represents a relative risk of the portfolio with respect to its
benchmark. However, as mentioned by Pope and Yadav [411] and Satchell
and Hwang [448], the tracking error can be tricky:


• First, the tracking-error can be “anticipated.” The risk is measured
ex-ante (tracking ex-ante). It is a statistical measure of riskiness w.r.t.
the benchmark.


• Second, the risk is measured on “realized” risk (tracking ex-post). This
measures the time series standard deviation of the realized active
returns.


For a tracking ex-ante, the fund manager chooses an allocation different
from the benchmark’s in order to get a better return expectation than the
benchmark. To do so, the anticipated function of the tracking, such as
standard deviation, may be significantly above 0. For example, for an ex-ante
tracking error of 2%, the portfolio returns will fall within ±2% of the
benchmark return with a probability of about two-thirds. At maturity, ideally,
the fund manager will have succeeded in having a realized return at a given
level r% above the benchmark, while the tracking error was almost constant.
Unfortunately, the tracking error is often underestimated. Pope and Yadav [411]
indicate that autocorrelation of excess returns is a reason for this underestimation.
Satchell and Hwang [448] argue that ex-ante and ex-post tracking errors
differ when the realized benchmark volatility is high. Haar and van Straalen
[279] confirm that ex-post tracking errors are higher than predicted because
of changes in volatility, but also because of the concentration in systematic
risk factors in a portfolio.
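
The “two-thirds” figure quoted above corresponds to a one-standard-deviation band of a Gaussian distribution. The short check below assumes, for illustration only, that active returns are Gaussian with zero mean; the function name `prob_within_band` is hypothetical.

```python
import math


def prob_within_band(band, tracking_error):
    """P(|active return| <= band) for a zero-mean Gaussian active return."""
    k = band / tracking_error
    return math.erf(k / math.sqrt(2.0))


p_one_te = prob_within_band(0.02, 0.02)  # band equal to one tracking error
p_two_te = prob_within_band(0.04, 0.02)  # band equal to two tracking errors
```

Under this assumption the probability is about 68% for a band of one tracking error and about 95% for a band of two tracking errors. A non-zero expected active return or fat tails would change these figures.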


4.2.1.2 Tracking-error, correlation and beta coefficients 


We first must distinguish the tracking error from the correlation coefficient
and the beta coefficient:


• The correlation coefficient ($\rho$) measures a “linear risk” between the
portfolio return $R_P$ and the benchmark return $R_B$. For a given standard
deviation $\sigma_P$ of the portfolio return, the correlation coefficient decreases
when the specific risk increases. Here, the specific risk indicates the
portfolio risk which is independent from the benchmark return.


• The beta coefficient ($\beta_P$) measures the risk exposure to the benchmark.
For example, if $\beta_P > 1$, then the fund manager incurs a systematic risk
w.r.t. the benchmark.

If the objective function on the tracking error $T$ is the variance, denoted by
$T^2$, then, since $\beta_P = \rho\, \sigma_P / \sigma_B$, we have:

$$T^2 = \sigma^2(R_P - R_B) = \sigma_P^2 + \sigma_B^2 - 2\rho\, \sigma_P \sigma_B = \sigma_P^2 + \sigma_B^2\, (1 - 2\beta_P). \qquad (4.20)$$


Therefore, the tracking-error volatility (TEV) is a function of the total
portfolio volatility, of its exposure to the benchmark, and also of the total
benchmark volatility. As a consequence, if we focus on excess returns while
including benchmark securities, the total risk of the portfolio may be high.


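REMARK 4.4 The Sharpe market model allows for the introduction of
the relation between the tracking-error, the systematic risk $\beta_P \sigma_B$, and the
specific risk $\sigma_{\varepsilon_P}$.

This model is:

$$R_P = \alpha_P + \beta_P R_B + \varepsilon_P. \qquad (4.21)$$

Thus, we have:

$$R_P - R_B = \alpha_P + (\beta_P - 1) R_B + \varepsilon_P, \qquad (4.22)$$

where $\varepsilon_P$ has a centered Gaussian distribution with standard deviation $\sigma_{\varepsilon_P}$
and is uncorrelated with $R_B$. Then, we deduce:

$$T^2 = (\beta_P - 1)^2 \sigma_B^2 + \sigma_{\varepsilon_P}^2. \qquad (4.23)$$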
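
Relations (4.20) and (4.23) can be checked against each other numerically. The sketch below uses hypothetical values of the beta, the benchmark volatility and the specific risk, and assumes, as in the market model, that the residual term is uncorrelated with the benchmark return.

```python
import math

beta_p = 1.2      # hypothetical beta w.r.t. the benchmark
sigma_b = 0.15    # hypothetical benchmark volatility
sigma_eps = 0.04  # hypothetical specific risk

# Total portfolio variance implied by the market model (4.21).
var_p = beta_p ** 2 * sigma_b ** 2 + sigma_eps ** 2
sigma_p = math.sqrt(var_p)
rho = beta_p * sigma_b / sigma_p  # correlation implied by the beta

# Tracking-error variance from (4.20), in its two forms, and from (4.23).
t2_from_rho = var_p + sigma_b ** 2 - 2.0 * rho * sigma_p * sigma_b
t2_from_beta = var_p + sigma_b ** 2 * (1.0 - 2.0 * beta_p)
t2_market_model = (beta_p - 1.0) ** 2 * sigma_b ** 2 + sigma_eps ** 2

tev = math.sqrt(t2_market_model)
```

With these values the three expressions coincide and the TEV equals 5%, of which 3% comes from the beta deviation $(\beta_P - 1)\sigma_B$ and 4% from the specific risk.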


4.2.2 Tracking-error minimization 


Several portfolio optimization programs can be introduced to control the 
tracking-error, depending on what kind of objective function is used. Specific 
constraints are also often introduced. In what follows, the volatility of the 
tracking-error is considered, since it is the basic criterion. 

Roll [431] has studied two optimization programs under a constraint on
the tracking-error’s volatility: minimization of the tracking-error volatility
(TEV), and minimization of TEV under a beta constraint.


4.2.2.1 TEV minimization under a mean-return constraint 


Consider the same notations as in Chapter 3 for the mean-variance analysis, 
in particular the definitions of A,B,C, and D (see 3.1.2.1). In addition, 
denote: 


$b$: the benchmark weight vector

$x = (w - b)$: the vector of differences between the weights of the portfolio
and those of the benchmark

$R_B = b'R$: the expected benchmark return

$\sigma_B^2 = b'Vb$: the variance of the benchmark return




The tracking-error volatility (TEV), denoted also by $T$, is:

$$T = \sqrt{(w - b)'\, V\, (w - b)} = \sqrt{x'Vx}, \qquad (4.24)$$

where $V$ is the variance-covariance matrix of asset returns.
The first optimization problem is the following one:

$$P(1): \quad \begin{cases} \min_w\ (w - b)'\, V\, (w - b) \\ \text{with } (w - b)'R = G, \\ w'e = 1. \end{cases} \qquad (4.25)$$


The term G is the excess expected return of the portfolio w.r.t. the bench- 
mark. Roll [431] proves the following result. 


PROPOSITION 4.1
1) The optimal solution is given by:

$$w = b + D\,(q_1 - q_0), \qquad (4.26)$$

where

$$D = \frac{G}{R_1 - R_0}, \quad q_0 = \frac{V^{-1}e}{C} \quad \text{and} \quad q_1 = \frac{V^{-1}R}{A}. \qquad (4.27)$$

The coefficient $D$ can be viewed as a relative performance coefficient.
The portfolio $q_0$ is the mvp, and $q_1$ is also a mean-variance efficient
portfolio, with:

$$R_0 = \frac{A}{C}, \quad R_1 = \frac{B}{A}, \quad \sigma_0^2 = \frac{1}{C}, \quad \sigma_1^2 = \frac{B}{A^2}.$$

2) The tracking-error variance and the total variance of optimal solutions
from Problem P(1) are given by:

$$T^2 = D^2\, (\sigma_1^2 - \sigma_0^2),$$

$$\sigma_P^2 = \sigma_B^2 + T^2 + 2D\sigma_0^2\, (R_B/R_0 - 1). \qquad (4.28)$$

REMARK 4.5 Optimal portfolios given in relation (4.26) are the sums of
the benchmark and of a deviation term. This deviation depends only on two
special mean-variance efficient portfolios and on the anticipated excess return;
it is independent of the benchmark. The tracking-error volatility (relation
(4.28)) is an increasing linear function of the anticipated excess return,
which does not depend on the benchmark.

DEFINITION 4.3 The set of optimal solutions associated with Problem
P(1) is generated by the variation of the excess return $G$. In risk-return
space, this set is called the relative frontier.


REMARK 4.6 Most of the time, the benchmark is not mean-variance 
efficient, as empirically shown by Grinold [272]. If the benchmark is efficient, 
then all optimal solutions of Problem P(1) are also mean-variance efficient. 
Otherwise, they are not. 


Portfolio solutions of Problem P(1) have two shortcomings:
• First, generally they do not mean-variance dominate the benchmark.
• Second, their beta coefficients w.r.t. the benchmark are higher than 1.


These properties are examined in what follows. 
A) Mean-variance dominance. 


- Case 1: $R_B > R_0$. Solutions of Problem P(1) do not dominate the
benchmark. This case is illustrated by the following figure.

FIGURE 4.4: Efficient and relative frontiers, $R_B > R_0$


In that case, the benchmark has an expected return above the mvp expected
return. Then, if the excess return of a portfolio w.r.t. the benchmark is
positive ($G > 0$), the portfolio total risk is higher than the benchmark total
risk (see relation (4.28)). If a riskless asset is introduced, such property is still
satisfied for a benchmark having an expected return higher than the riskless one.
Therefore, for this first case, which is the most usual, none of the solutions of
Problem P(1) mean-variance dominates the benchmark (which also does not
dominate them).


- Case 2: $R_B < R_0$. Some of the optimal solutions dominate the
benchmark. For small tracking error volatility, the total risk is smaller than the
benchmark total risk. However, for a higher excess return $G$, optimal
solutions no longer dominate the benchmark. These properties are illustrated by
the following figure, which includes both mean-variance and relative efficient
frontiers.

FIGURE 4.5: Efficient and relative frontiers, $R_B < R_0$


Indeed, the term $D$ measures a relative gain when taking a sufficient
tracking-error in order to generate an excess return. When $D$ is small (the portfolio is
close to the benchmark), a first-order Taylor’s expansion of relation (4.28) leads
to the following approximation:

$$\sigma_P^2 \simeq \sigma_B^2 + 2D\sigma_0^2\, (R_B/R_0 - 1). \qquad (4.29)$$

Since $R_B < R_0$ (with $R_0 > 0$ and $D > 0$), the portfolio total risk $\sigma_P^2$ is then
smaller than the benchmark total risk $\sigma_B^2$.


This explains why the portfolio manager gets a portfolio which dominates
the benchmark (according to mean-variance criterion).

B) Beta coefficients w.r.t. the benchmark.

The beta coefficients of optimal solutions w.r.t. the benchmark are given
by:

$$\beta_P = 1 + D \left(\frac{\sigma_0^2}{\sigma_B^2}\right) (R_B/R_0 - 1). \qquad (4.30)$$

Consider the usual case when the benchmark has an expected return higher
than the expected return of the mvp ($R_B > R_0$). Then, $\beta_P > 1$. Therefore,
the systematic risk w.r.t. the benchmark is significant, as soon as the portfolio
manager has a benchmark-timing strategy.


4.2.2.2 TEV minimization under mean-return and beta constraints 


In order to mitigate the previous drawbacks, Roll [431] proposes a relative
optimization problem with an additional constraint on the portfolio beta. For
a given portfolio exposure to the benchmark (the beta), we can search for
portfolios which minimize the tracking error. The interesting point is to get
optimal solutions with beta smaller than 1, according to the given constraint.

The new optimization problem is:

$$P(2): \quad \begin{cases} \min_w\ (w - b)'\, V\, (w - b) \\ \text{with } (w - b)'R = G, \\ w'e = 1, \\ \dfrac{w'Vb}{\sigma_B^2} = \beta. \end{cases} \qquad (4.31)$$


PROPOSITION 4.2
The solution of Problem P(2) is given by:

$$w^* = b + \nu b + V^{-1}(\gamma R + \delta e), \qquad (4.32)$$

where the Lagrange multipliers are given by:

$$\gamma = \frac{G\,(1 - C\sigma_B^2) - \sigma_B^2\,(\beta - 1)(B - C R_B)}{(A - B R_B)(1 - C\sigma_B^2) - (B - C R_B)(R_B - B\sigma_B^2)}, \qquad (4.33)$$

$$\delta = \frac{\sigma_B^2\,(\beta - 1)(A - B R_B) - G\,(R_B - B\sigma_B^2)}{(A - B R_B)(1 - C\sigma_B^2) - (B - C R_B)(R_B - B\sigma_B^2)}, \qquad (4.34)$$

$$\nu = -\gamma B - \delta C. \qquad (4.35)$$

Roll [431] proves that the optimal solution is a convex combination of the three
portfolios $q_0$, $q_1$, and the benchmark $b$.


The following figure presents three “beta frontiers.”

FIGURE 4.6: Efficient, relative, and beta frontiers


Note that for some fixed beta values (for example $\beta < 1$), some solutions
of Problem P(2) mean-variance dominate the relative frontier associated with
Problem P(1). Nevertheless, these portfolios have tracking-error volatilities
higher than those of the relative frontier.


REMARK 4.7 By fixing beta values smaller than 1, the benchmark-timing
risk is limited. However, for $\beta \neq 1$, the beta frontier does not contain
the benchmark, which may be a drawback.



4.2.2.3 Mean-variance optimization under TEV constraint 


As mentioned by Roll [431], excess return optimization leads to optimal 
portfolios with systematically higher total risk than the benchmark. Thus, 
we can search for a criterion which controls both the total risk and the rela- 
tive risk (TEV). 


For example, Bertrand et al. [61] consider that investors (or fund man- 
agers) maximize a mean-variance criterion under a tracking-error volatility 
constraint. In this framework, the manager has to solve the following opti- 
mization problem: 


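$$
P(3): \quad
\begin{cases}
\max_{w} \; w'\overline{R} - \phi\, w' V w \\
\text{with } (w - b)' V (w - b) = T^2, \\
\phantom{\text{with }} w'e = 1.
\end{cases}
\tag{4.36}
$$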


The parameter $\phi$ is the marginal substitution rate between the return and
the variance. It can be viewed as an aversion w.r.t. the variance. In what
follows, we call it the “variance aversion.”

PROPOSITION 4.3
The optimal solution is given by (see Bertrand et al. [61]):


we =b+ 75 (—¢b + V~! (R — pe)) 











E TE as 
p= Set. 


REMARK 4.8 The set of portfolio solutions of the previous Problem
P(3) is the same as the set of portfolio solutions of Roll's program, as soon
as the expected rate of return of the benchmark is greater than that of the
minimum variance portfolio.


For a fixed variance aversion $\phi$, the $\phi$-frontier associated to Problem P(3)
is generated as the TEV varies.

FIGURE 4.7: Efficient, relative, and $\phi$ frontiers

When the variance aversion $\phi$ increases, the $\phi$-frontiers move to the left
(area of portfolios with small total risk). If the variance aversion is high,
the portfolio manager chooses portfolios with a total risk smaller than the
benchmark total risk. If the variance aversion goes to infinity, then the optimal
portfolio converges to the minimum variance portfolio (mvp), provided the TEV
is sufficiently high. For some values of $\phi$ (for example $\phi = 2.5$), some
optimal portfolios mean-variance dominate the benchmark. Note also that all
these frontiers contain the benchmark.




REMARK 4.9 When $\phi = 0$, the set of optimal solutions of P(3) is equal
to the set of solutions of Problem P(1). The relative frontier corresponds to
investors who are “neutral” w.r.t. the total risk ($\phi = 0$). Nevertheless, we
have to choose a risk-aversion level. For example, we can consider values of
$\phi$ such that there exists a tangency point between the $\phi$-frontier (solution of
P(3)) and the mean-variance frontier. An optimal portfolio can be chosen on
that curve, according to the TEV. The sets of optimal solutions of Problems
P(2) and P(3) which have mean returns higher than the mvp return are
equal: any solution of P(2), associated to a pair $(G, \beta)$, is equal to one and
only one solution of P(3), associated to a pair $(\phi, T)$.


Example 4.1 
Consider a fund manager who chooses a TEV equal to 3.00%. For the param- 
eter values considered in this example, we have: 
1) For $\phi = 2.4145$, the portfolio solution of Problem P(3) has the following
characteristics:

Standard deviation Mean TEV Beta
18.00% 9.70% 3.00% 0.975


This portfolio mean-variance dominates the benchmark. Its beta is smaller 
than 1. Therefore, this portfolio is closer to the benchmark than the corre- 
sponding portfolio belonging to the relative frontier. 

2) Consider now the portfolio belonging to the relative frontier with the 
same TEV. Its characteristics are: 


Standard deviation Mean TEV Beta 
19.24% 10.02% 3.00% 1.045 


The fund manager may choose to lose 0.32% in the mean return in order
to reduce the total volatility (-1.24%) while also reducing the exposure beta
(from 1.045 > 1 to 0.975 < 1).

3) We can also consider the TEV-frontier with TEV=3.00% (the constant- 
TEV-frontiers are ellipses in the risk/return space). Then, we can select the 
portfolio belonging to this curve with the same total risk as the benchmark. 
Its characteristics are: 


Standard deviation Mean TEV Beta 
18.21% 9.787% 3.00% 0.986 


Contrary to the solution of Problem P(3) with $\phi = 2.4145$, this portfolio
does not mean-variance dominate the benchmark.

The following figure shows the comparison between the three criteria and
the associated solutions.

FIGURE 4.8: Frontiers and iso-tracking curves


4.3 Further reading 


As examined in Sorenson et al. [474], investors and fund managers have to 
allocate between active and passive management. 


Strategic allocation consists of defining the orientation of the investment: 

By determining the risk aversion and horizon in order to choose the bench- 
mark; 

By choosing the portfolio composition: asset class allocation among cash, 
fixed-income products (domestic or international) and stocks (by sectors or 
styles), alternative investments, etc; and, 

By fixing the tolerance of the fund manager with respect to the benchmark. 


Tactical allocation is more concerned with the effective management process. 
It consists of searching for the best way to achieve the previous goal: 

By choosing the asset class weights with respect to the benchmark. This
can be done by market timing (which determines the exposure to the
benchmark), and by asset allocation among sectors, countries, etc.

By using asset valuation models.

The main index tracking methods are based on statistical replication, which
goes back to the 1970s. Many articles have examined this method. Rudd [441]
examines the selection of passive portfolios. Rudd and Rosenberg [442] consider
portfolio optimization problems and their realistic implementation. Bamberg
and Wagner [40] search for robust regression estimators to replicate equity
indexes. Dash et al. [148] attempt to measure the efficiency and costs of such
portfolio optimization.


Evolutionary algorithms are appropriate candidates for continuously
tracking the movement of the optimum through time. In order to solve
large-scale optimization problems with constraints, “metaheuristic” methods
have been introduced, e.g., by Derigs and Nickel [163] for index tracking.
They are based on neighborhood methods and evolutionary algorithms such
as genetic algorithms. These latter methods are also used in other applications
in finance, as shown by Bauer [51]. Conditions ensuring the convergence
of the threshold-accepting algorithm are also provided in Althöfer and
Koschnick [21]. Applications of this algorithm in econometrics are provided
in Winker [505].


An overview of some different benchmark portfolio optimization procedures 
is given in Wagner [500] (see also Wagner [499]). Other references are: Franks 
[241] and Clarke et al. [124] who study tracking errors and tactical asset al- 
location, and, Jorion [310] who proves that the set of tracking-error volatility 
constrained portfolios is an ellipse on the mean-variance plane. 


Gaivoronski and Krylov [250] present several portfolio selection algorithms 
in a dynamic setting which take into account different risk/target measures. 
They search for the best tradeoff between tracking precision and rebalancing 
frequency in order to control costs. 


Chapter 5 


Portfolio performance 


Competition among financial institutions has led to the development of
performance analysis, which examines the qualities of managers who use active
investment strategies. The investment process is evaluated as a whole.


The fundamental question is: 


Does a given manager provide an actual value-added service with respect to 
a simple index replication or to a given benchmark? 


As a by-product, the performance analysis is a test of the market efficiency 
which is based on different kinds of information, as introduced by Fama [217]. 

An active manager tries to take opportunities from particular information 
(public or private) which may not be reflected by market prices. 


To analyze a manager’s performance, it is necessary:

• First, to introduce specific performance measures, taking account of
different types of risk. Therefore, several risk measures can be considered
in order to evaluate the risk associated to the performance.

• Second, to attribute this performance:

- What are the exact contributions of each investment decision to the
overall portfolio performance?

- Is the manager skillful or lucky?

- Compared with the benchmark, does the performance come from stock
picking or from market timing?

• Finally, to examine the performance consistency: more precisely, is past
performance a good indicator of future performance?

5.1 Standard performance measures


In order to ensure that fund comparisons are legitimate: 


• First, funds must have the same nature.

• Second, a standardization of performance measures and their results
must be introduced.

The AIMR (Association for Investment Management and Research) provided
such standards, further developed by the GIPS (Global Investment Performance
Standards). They focus mainly on equities and fixed-income securities.
Generally, two measures are considered: a measure of total risk, such as the 
standard deviation, and a measure of market risk, such as the beta. The first 
one is “absolute,” the second one is “relative.” 


For the latter measure, we have to evaluate the sensitivity of securities to 
the market. This can be done by using the fundamental CAPM introduced 
by Sharpe ([462],[463]). 


5.1.1 The Capital Asset Pricing Model 


Suppose that there exists a riskless asset with return $R_f$. As seen in Chapter
3, under the Markowitz assumptions, the two-fund separation theorem is valid:
any efficient portfolio $P$ is a combination of the riskless asset and of the market
portfolio $M$, which corresponds to the point of tangency between the two
efficient frontiers (with and without the riskless asset).

Then, denoting by $x$ the proportion invested in the riskless asset, we have:

$$
\overline{R}_P = x R_f + (1 - x)\overline{R}_M \quad \text{and} \quad \overline{R}_P - R_f = (1 - x)\left(\overline{R}_M - R_f\right). \tag{5.1}
$$

The choice of $x$ depends on the risk aversion. The standard deviation of the
portfolio return is equal to:

$$
\sigma_P = (1 - x)\,\sigma_M. \tag{5.2}
$$

Consequently,

$$
\overline{R}_P = R_f + \sigma_P \left(\frac{\overline{R}_M - R_f}{\sigma_M}\right). \tag{5.3}
$$

In the Markowitz plane $(\sigma(R), \overline{R})$, this equation defines the efficient frontier,
which is also called the capital market line. 

Following Sharpe’s demonstration, at equilibrium the prices of assets are
such that the market portfolio is made up of all assets in proportion to their
market capitalizations (in practice, a stock exchange index). Then, consider
the portfolio $P_i$ (not necessarily efficient) with a proportion $x$ invested in
asset $i$, and $(1 - x)$ invested in $M$. We have:

$$
\overline{R}_{P_i} = x\overline{R}_i + (1 - x)\overline{R}_M, \tag{5.4}
$$

$$
\sigma_{P_i} = \left[x^2\sigma_i^2 + (1 - x)^2\sigma_M^2 + 2x(1 - x)\sigma_{iM}\right]^{1/2}. \tag{5.5}
$$

Consider now the curve of all possible portfolios $P_i$ when $x$ is varying. The
slope of the tangent to this curve at the point associated to $x$ is given
by:

$$
\frac{\partial \overline{R}_{P_i}}{\partial \sigma_{P_i}}
= \frac{\partial \overline{R}_{P_i}/\partial x}{\partial \sigma_{P_i}/\partial x}
= \frac{(\overline{R}_i - \overline{R}_M)\,\sigma_{P_i}}{x(\sigma_i^2 + \sigma_M^2 - 2\sigma_{iM}) + \sigma_{iM} - \sigma_M^2}. \tag{5.6}
$$

For the market portfolio $M$ we have $x = 0$ and $\sigma_{P_i} = \sigma_M$. Since at this
point the curve is tangent to the capital market line, we deduce from Equation
(5.3) that:

$$
\frac{(\overline{R}_i - \overline{R}_M)\,\sigma_M}{\sigma_{iM} - \sigma_M^2} = \frac{\overline{R}_M - R_f}{\sigma_M},
$$

which is equivalent to:

$$
\overline{R}_i - R_f = \beta_i\left(\overline{R}_M - R_f\right) \quad \text{with} \quad \beta_i = \sigma_{iM}/\sigma_M^2. \tag{5.7}
$$

The coefficient beta represents the systematic risk, which is due to exposure
to the market variations. In the plane $(\beta, \overline{R})$, the previous equation
defines a straight line, the so-called security market line. At equilibrium, all
assets (thus all portfolios) are located on this line.
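The tangency argument behind Relation (5.7) can be illustrated numerically. The following sketch uses hypothetical inputs (riskless rate, market mean and volatility, asset volatility and covariance). It prices asset i with (5.7) and checks that the slope (5.6) at x = 0 coincides with the slope of the capital market line. The function names are illustrative only.

```python
import math


def portfolio_point(x, r_i, r_m, s_i, s_m, s_im):
    """Mean and standard deviation of P_i (x in asset i, 1 - x in M), Eqs. (5.4)-(5.5)."""
    mean = x * r_i + (1 - x) * r_m
    var = x**2 * s_i**2 + (1 - x)**2 * s_m**2 + 2 * x * (1 - x) * s_im
    return mean, math.sqrt(var)


def slope_formula(x, r_i, r_m, s_i, s_m, s_im):
    """Slope of the curve of portfolios P_i in the (sigma, mean) plane, Eq. (5.6)."""
    _, s_p = portfolio_point(x, r_i, r_m, s_i, s_m, s_im)
    return (r_i - r_m) * s_p / (x * (s_i**2 + s_m**2 - 2 * s_im) + s_im - s_m**2)


# Hypothetical inputs (illustrative numbers only).
r_f, r_m, s_m = 0.02, 0.07, 0.15
s_i, s_im = 0.25, 0.027

beta_i = s_im / s_m**2                    # Eq. (5.7)
r_i = r_f + beta_i * (r_m - r_f)          # equilibrium mean return of asset i
cml_slope = (r_m - r_f) / s_m             # slope of the capital market line

# Central finite difference of the curve around x = 0, as an independent check.
h = 1e-6
m1, s1 = portfolio_point(h, r_i, r_m, s_i, s_m, s_im)
m0, s0 = portfolio_point(-h, r_i, r_m, s_i, s_m, s_im)
numeric_slope = (m1 - m0) / (s1 - s0)

print("beta_i:", beta_i, "mean return:", r_i)
print("slopes:", slope_formula(0.0, r_i, r_m, s_i, s_m, s_im), numeric_slope, cml_slope)
```

With these inputs the beta is 1.2 and the three slopes agree, which is the tangency condition used in the derivation.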


FIGURE 5.1: Security market line (mean return plotted against beta; the market portfolio has a beta equal to 1)


Note also that, at equilibrium, the market portfolio is optimal, which is
in favor of passive management based on index funds. As shown by Black
[73], the CAPM is still valid without the riskless asset, which is replaced by
a zero-beta portfolio $Z$:

$$
\overline{R}_i - \overline{R}_Z = \beta_i\left(\overline{R}_M - \overline{R}_Z\right) \quad \text{with} \quad \beta_i = \sigma_{iM}/\sigma_M^2.
$$

Taxes can also be taken into account, as in Brennan [87]:

$$
\overline{R}_i - R_f - T(D_i - R_f) = \beta_i\left(\overline{R}_M - R_f - T(D_M - R_f)\right) \quad \text{with} \quad T = \frac{T_d - T_g}{1 - T_g},
$$

where $T_d$ denotes the average taxation rate for dividends, $T_g$ denotes the
average taxation rate for capital gains, and $D_i$ and $D_M$ denote respectively the
dividend yields of asset $i$ and market portfolio $M$.


The CAPM highlights the relationship between the excess mean return (the 
“reward” ) and the exposure coefficient beta (the “risk”). Other factors can 
also be introduced to estimate the excess return, as shown in the next section 
on performance decomposition. 


5.1.2 The three standard performance measures 


These performance measures are based on the previous properties: Relation 
(5.3) which defines the capital market line and relation (5.7) which defines 
the security market line. The higher these performance measures, the more 
interesting the portfolio. 


5.1.2.1 The Sharpe measure 


This measure is based on the capital market line. For all efficient
portfolios, we have the following equality between (excess reward)/(total risk)
ratios:

$$
\frac{\overline{R}_P - R_f}{\sigma(R_P)} = \frac{\overline{R}_M - R_f}{\sigma(R_M)}. \tag{5.8}
$$

The previous ratio is the slope of the capital market line. As for Markowitz
portfolio optimization, we search for such portfolios, since this is equivalent
to the maximization of the ratio

$$
RS = \frac{\overline{R}_P - R_f}{\sigma(R_P)}.
$$

Thus, as proposed by Sharpe [464], this can actually be considered as a
performance measure. It is equal to the ratio of the excess mean return (w.r.t.
the riskless asset) to the measure of total risk (the standard deviation), and is
defined as the “reward-to-variability ratio.”
For example, the manager can check if the excess mean return of the portfolio
is sufficient to compensate a higher risk than the market portfolio. If the
portfolio is well-diversified, its Sharpe ratio is close to the market portfolio’s
(see also comments in Sharpe [466]).

5.1.2.2 The Treynor measure

The Treynor ratio [494] is directly based on the CAPM:

$$
RT = \frac{\overline{R}_P - R_f}{\beta_P}. \tag{5.9}
$$

It can also be viewed as a reward-to-risk ratio where the “risk” is the
exposure to market risk. At equilibrium, this ratio is constant and equal to
$\overline{R}_M - R_f$. The Treynor ratio allows us to evaluate the performance of a
well-diversified portfolio, since it only involves the systematic risk. It can be used
to examine the performance of a portfolio which is only a part of the investor’s
assets. Due to previous diversification, the investor takes care only of the
systematic risk.


5.1.2.3 The Jensen measure 


This measure is also based on the CAPM, but mainly when it is not satisfied.
Indeed, consider the difference $\alpha_P$ between the mean excess return of portfolio
$P$ and the return explained by the CAPM. Then:

$$
\alpha_P = \left(\overline{R}_P - R_f\right) - \beta_P\left(\overline{R}_M - R_f\right). \tag{5.10}
$$

Consequently:

$$
\overline{R}_P - R_f = \alpha_P + \beta_P\left(\overline{R}_M - R_f\right). \tag{5.11}
$$

The coefficient $\alpha_P$ is the performance measure introduced by Jensen
[300]. The manager therefore searches for portfolios with $\alpha_P > 0$. For ex-post
measures, the manager will be judged skillful if the coefficient $\alpha_P$ is
significantly above 0.

If $\alpha_P = 0$, the return of portfolio $P$ is at equilibrium and the manager’s
forecasts have not beaten the market performance: the portfolio has the same
alpha as any combination of the riskless asset and the market portfolio. Unlike
the Sharpe and Treynor ratios, the Jensen measure contains the benchmark
itself. As with the Treynor ratio, it only takes account of the systematic risk.
Due to the particular form of alpha, only portfolios with the same risk beta
can be compared. Otherwise, we can consider, for example, the Black-Treynor
ratio:

$$
RBT = \frac{\alpha_P}{\beta_P}.
$$

REMARK 5.1 Previous ratios are defined ex-ante. However, ex-post
formulas can also be introduced, as shown by Jensen [300]. Consider the
market model:

$$
R_{P,t} = \gamma_P + \beta_P R_{M,t} + \varepsilon_{P,t}, \tag{5.12}
$$

where $\varepsilon_{P,t}$ has zero expectation, is independent from $R_{M,t}$, and:

$$
\mathrm{Cov}(\varepsilon_{P,t}, \varepsilon_{P,s}) = 0, \quad \forall t \neq s.
$$

The return expectation of portfolio $P$ is given by:

$$
\overline{R}_P = \gamma_P + \beta_P \overline{R}_M.
$$

Thus:

$$
R_{P,t} = \overline{R}_P + \beta_P\left(R_{M,t} - \overline{R}_M\right) + \varepsilon_{P,t}.
$$

Finally:

$$
R_{P,t} - R_f = \alpha_P + \beta_P\left(R_{M,t} - R_f\right) + \varepsilon_{P,t}. \tag{5.13}
$$

Assume that $R_f$ and $\beta_P$ are constant. Then, we can get an unbiased estimation
of the coefficient $\alpha_P$ by using a least squares regression. The estimators
of the unknown parameters $\alpha_P$ and $\beta_P$ are given by:

$$
\hat{\beta}_P = \frac{\mathrm{Cov}\left(R_M - R_f,\, R_P - R_f\right)}{\sigma^2\left(R_M - R_f\right)},
\qquad
\hat{\alpha}_P = \left(\overline{R}_P - R_f\right) - \hat{\beta}_P\left(\overline{R}_M - R_f\right),
$$

where the symbol $\overline{X}$ denotes the empirical mean of the sample which contains
$T$ observations of the random variable $X$. The significance of the estimator $\hat{\alpha}_P$
can be based on a Student test on the hypothesis $H_0 : \alpha_P = 0$. The ex-post
Sharpe and Treynor ratios are given by:

$$
\widehat{RS} = \frac{\overline{R}_P - R_f}{\hat{\sigma}(R_P)} \quad \text{and} \quad \widehat{RT} = \frac{\overline{R}_P - R_f}{\hat{\beta}_P}. \tag{5.14}
$$

REMARK 5.2 Jensen [300] considers the following beta specification:

$$
\beta_{P,t} = E[\beta_P] + \varepsilon_{P,t},
$$

where $\varepsilon_{P,t}$ is a white noise. The manager has a mean beta target $E[\beta_P]$. In
this framework, a particular estimation model of the alpha coefficient $\alpha_P$ is
introduced (see [300]). The manager can adjust the beta from the global
market forecasts.
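The ex-post estimators of Remark 5.1 can be sketched with the standard library only. The return series below are hypothetical, and the portfolio is deliberately built without noise from an alpha of 0.1% and a beta of 1.2, so that the regression should recover these two values. The function name `ex_post_measures` is an assumption of this sketch.

```python
from statistics import fmean, stdev


def ex_post_measures(r_p, r_m, r_f):
    """Least-squares alpha and beta, plus ex-post Sharpe and Treynor ratios."""
    ex_p = [r - r_f for r in r_p]            # portfolio excess returns
    ex_m = [r - r_f for r in r_m]            # market excess returns
    mean_p, mean_m = fmean(ex_p), fmean(ex_m)
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(ex_p, ex_m))
    var = sum((m - mean_m) ** 2 for m in ex_m)
    beta = cov / var                          # the 1/(T-1) factors cancel out
    alpha = mean_p - beta * mean_m
    return {
        "alpha": alpha,
        "beta": beta,
        "sharpe": mean_p / stdev(r_p),
        "treynor": mean_p / beta,
    }


# Hypothetical monthly data: constant riskless rate, six market returns.
r_f = 0.002
r_m = [0.030, -0.010, 0.040, 0.000, 0.020, -0.020]
r_p = [r_f + 0.001 + 1.2 * (r - r_f) for r in r_m]   # alpha = 0.1%, beta = 1.2

res = ex_post_measures(r_p, r_m, r_f)
print(res)
```

On real data a residual term is present, and the significance of the estimated alpha would have to be assessed with the Student test mentioned in the remark.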
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5.1.2.4 Performance measures comparison 


From Equation (5.11), relations between all these measures can be deduced.

PROPOSITION 5.1
(Relations between Sharpe, Treynor, and Jensen measures)

- The Sharpe ratio is also equal to:

$$
RS = \frac{E[R_P] - R_f}{\sigma(R_P)} = \frac{\alpha_P}{\sigma(R_P)} + \frac{\rho(R_P, R_M)}{\sigma(R_M)}\left(E[R_M] - R_f\right). \tag{5.15}
$$

For any efficient portfolio $P$, we have: $\rho(R_P, R_M) = 1$. Therefore, we get:

$$
RS = \frac{\alpha_P}{\sigma(R_P)} + \frac{E[R_M] - R_f}{\sigma(R_M)}. \tag{5.16}
$$

- The Treynor ratio is simply equal to:

$$
RT = \frac{E[R_P] - R_f}{\beta_P} = \frac{\alpha_P}{\beta_P} + \left(E[R_M] - R_f\right). \tag{5.17}
$$

- If the portfolio is well-diversified, we have:
$\rho(R_P, R_M) \approx 1$. Thus: $\beta_P \approx \sigma_P/\sigma_M$. Then: $RS \approx RT/\sigma_M$.


The following example illustrates properties of the three performance mea- 
sures. 


Example 5.1 
Suppose that the fund manager has private information about a given firm. 
The portfolio denoted by A contains only this stock. We have: 


$$
R_A = \alpha_A + R_f + \beta_A\left(R_M - R_f\right) + \varepsilon_A.
$$

We assume that this information is sufficiently relevant such that:

$$
\underbrace{E[R_A]}_{4.9\%} = \underbrace{\alpha_A}_{0.5\%} + \underbrace{R_f}_{2\%} + \underbrace{\beta_A}_{0.687}\Big[\underbrace{E[R_M]}_{5.49\%} - R_f\Big].
$$

The total risk is assumed to be equal to 6% and the specific risk equal to
4.72%. These risks are assumed not to be modified by the information.

A second fund manager has a well-diversified portfolio $B$, due to the “good”
knowledge of many stocks, with total risk equal to 8.71% and specific risk
equal to 4.73%. We have:

$$
R_B = \alpha_B + R_f + \beta_B\left(R_M - R_f\right) + \varepsilon_B,
$$

with:

$$
\underbrace{E[R_B]}_{7.24\%} = \underbrace{\alpha_B}_{0.5\%} + \underbrace{R_f}_{2\%} + \underbrace{\beta_B}_{1.359}\Big[\underbrace{E[R_M]}_{5.49\%} - R_f\Big]. \tag{5.18}
$$

The next figure illustrates these assumptions. Points $A'$ and $B'$ correspond
to portfolios $A$ and $B$ at equilibrium (i.e., with zero alpha).


FIGURE 5.2: SML and portfolios A, B, A’ and B’ (mean return plotted against beta)


The following figure represents the efficient frontier without riskless asset
and the CML.

FIGURE 5.3: Capital market line

The well-diversified fund $B'$ is assumed to be efficient, while the fund $A'$,
which contains only one stock, is not on this frontier. Using the alpha measure,
funds $A$ and $B$ cannot be distinguished since both fund managers have the
same excess return w.r.t. the CAPM (equal to 0.5%). In this special case,
the Jensen measure does not allow for the comparison of these two portfolios,
whose risks are too different.

As mentioned by Modigliani and Pogue [392], the Treynor ratio allows us 
to reduce the bias due to the leverage effect which is contained in the Jensen 
alpha. Indeed, fund A has an excess return per systematic risk unit which 
is higher than fund B. Assuming that we can borrow the riskless asset, the 
leverage effect allows us to reach A* (from A), which has the same beta as 
fund B but a higher mean return. The following figure shows A, A*, and B. 
Funds $A$ and $A^*$ have the same Treynor ratio (equal to 4.22%), which is higher
than that of fund $B$ (3.86%).

FIGURE 5.4: Capital market line and leverage effect (mean return plotted against standard deviation)


The Sharpe ratio applied to funds A and B allows us to take account of 
the diversification. The Sharpe ratio of fund A is equal to 0.4833, while for 
fund B it is equal to 0.6025. Therefore, the manager who chooses the more 
diversified portfolio has a higher performance ranking. 
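The figures of this example can be reproduced with a few lines of code. The sketch below takes the rounded inputs quoted in the example (means, betas, total risks, riskless rate and market mean); the function names and the dictionary layout are illustrative choices only.

```python
def jensen_alpha(mean, beta, r_f, mean_m):
    """Jensen measure, Eq. (5.10)."""
    return (mean - r_f) - beta * (mean_m - r_f)


def treynor(mean, beta, r_f):
    """Treynor ratio, Eq. (5.9)."""
    return (mean - r_f) / beta


def sharpe(mean, sigma, r_f):
    """Sharpe ratio: excess mean return over total risk."""
    return (mean - r_f) / sigma


r_f, mean_m = 0.02, 0.0549
funds = {
    "A": {"mean": 0.049, "beta": 0.687, "sigma": 0.06},
    "B": {"mean": 0.0724, "beta": 1.359, "sigma": 0.0871},
}

for name, f in funds.items():
    print(name,
          "alpha=%.4f" % jensen_alpha(f["mean"], f["beta"], r_f, mean_m),
          "treynor=%.4f" % treynor(f["mean"], f["beta"], r_f),
          "sharpe=%.4f" % sharpe(f["mean"], f["sigma"], r_f))

# Leveraged fund A*: borrow at the riskless rate so that its beta equals beta_B.
leverage = funds["B"]["beta"] / funds["A"]["beta"]
mean_a_star = r_f + leverage * (funds["A"]["mean"] - r_f)
print("A* mean return: %.4f" % mean_a_star)
```

Both alphas come out at about 0.5%, the Treynor ratios at about 4.22% and 3.86%, and the Sharpe ratio of fund A at 0.4833. With these rounded inputs the Sharpe ratio of fund B is about 0.602, slightly below the 0.6025 quoted above, presumably because of rounding in the published inputs.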



PROPOSITION 5.2
(A condition ensuring the same ranking by Sharpe and Treynor ratios)

Consider two funds $A$ and $B$. Their rankings w.r.t. Sharpe and Treynor
ratios are equivalent if they have the same correlation coefficient with the
market portfolio:

$$
\left(RT_A > RT_B \iff RS_A > RS_B\right) \quad \text{if} \quad \rho_{AM} = \rho_{BM}. \tag{5.19}
$$

PROOF We have:

$$
\begin{aligned}
RT_A > RT_B
&\iff \frac{\overline{R}_A - R_f}{\beta_A} > \frac{\overline{R}_B - R_f}{\beta_B} \\
&\iff \sigma_M\,\frac{\overline{R}_A - R_f}{\rho_{AM}\,\sigma_A} > \sigma_M\,\frac{\overline{R}_B - R_f}{\rho_{BM}\,\sigma_B} \\
&\iff \frac{RS_A}{\rho_{AM}} > \frac{RS_B}{\rho_{BM}}.
\end{aligned}
$$

REMARK 5.3 If $RT_A > RT_B$ and $\rho_{AM} > \rho_{BM}$, then $RS_A > RS_B$.
Otherwise, no general comparison is possible. Using the market model, we
have:

$$
\sigma_A^2 = \beta_A^2\sigma_M^2 + \sigma_{\varepsilon_A}^2
\;\Longrightarrow\;
\sigma_A^2 = \rho_{AM}^2\sigma_A^2 + \sigma_{\varepsilon_A}^2
\;\Longrightarrow\;
\rho_{AM}^2 = 1 - \frac{\sigma_{\varepsilon_A}^2}{\sigma_A^2}. \tag{5.20}
$$

Thus, the higher the correlation coefficient, the smaller the specific risk for a
given total risk. Then, if fund $A$ has a higher Treynor ratio than fund $B$ and
a smaller (specific risk)/(total risk) ratio, then fund $A$ has a higher Sharpe
ratio than fund $B$.


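PROPOSITION 5.3
(A condition ensuring the same ranking by Sharpe and Jensen measures)
If $\sigma_A < \sigma_B$ and $\rho_{AM} > \rho_{BM}$, then:

$$
\alpha_A > \alpha_B \;\Longrightarrow\; RS_A > RS_B. \tag{5.21}
$$

PROOF The Sharpe ratios are given by:

$$
RS_A = \frac{\alpha_A}{\sigma_A} + \frac{\rho_{AM}}{\sigma_M}\left(E[R_M] - R_f\right),
\qquad
RS_B = \frac{\alpha_B}{\sigma_B} + \frac{\rho_{BM}}{\sigma_M}\left(E[R_M] - R_f\right).
$$

Therefore, denoting by $RS_M = (E[R_M] - R_f)/\sigma_M$ the Sharpe ratio of the
market portfolio:

$$
RS_A > RS_B \iff \alpha_A - \frac{\sigma_A}{\sigma_B}\,\alpha_B > RS_M\left(\rho_{BM} - \rho_{AM}\right)\sigma_A.
$$

Using the assumption $\sigma_A < \sigma_B$ and $\rho_{AM} > \rho_{BM}$, we deduce:

$$
\alpha_A > \alpha_B \;\Longrightarrow\; RS_A > RS_B.
$$

REMARK 5.4 The Sharpe ratio does not modify the Treynor ratio ranking
$RT_A > RT_B$ if $\rho_{AM} > \rho_{BM}$. In order to get such a general relation between
the Sharpe ratio and the Jensen measure, we have to introduce an additional
condition on the total risk: $\sigma_A < \sigma_B$.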


5.1.2.5 Roll’s criticism 


According to Roll [430], it is very difficult to determine the true market
portfolio. Moreover, the market portfolio has to be mean-variance efficient. 
But, the true market portfolio cannot be observed since it must contain all 
the risky assets, even those which are not traded. Since a stock exchange 
index has to be substituted, the empirical studies depend on this choice. The 
CAPM is validated if this index is mean-variance efficient. 

Therefore, Roll’s criticism concerns performance measures based on the 
CAPM, such as the Treynor ratio and the Jensen measure. Using a portfolio 
which is not the true market portfolio may lead to estimation errors in the 
betas. However, the measurement errors can be corrected, as proposed by 
Shanken [461]. The Sharpe ratio avoids this problem, since it is based on the 
total risk instead of the beta. However, if we compare the Sharpe ratio of 
a given portfolio to the Sharpe ratio of an index, this comparison obviously
depends on the choice of index. In addition, some statistical problems may
also appear when estimating the Sharpe ratio. 


REMARK 5.5 What is the precision of Sharpe ratio estimation when using
financial data? Lo [359] analyzes the statistical distribution of the Sharpe
ratio, assuming different properties on the financial process which generates
the data. If portfolio returns are supposed to be independent and identically
distributed (iid) then, by the central limit theorem, the asymptotic distribution
of the Sharpe ratio estimator is such that:

$$
\sqrt{T}\left(\widehat{RS} - RS\right) \xrightarrow{\;d\;} N\left(0,\; 1 + \tfrac{1}{2}RS^2\right).
$$

Table 5.1 indicates some standard deviation values of the asymptotic Sharpe
ratio ($SE(\widehat{RS}) \approx \sqrt{(1 + \tfrac{1}{2}RS^2)/T}$).


For example, for 60 observations, if the true value of the Sharpe ratio is
equal to 1.50, then the standard deviation of the Sharpe ratio estimator is
equal to 0.188. If the true value of the Sharpe ratio is equal to 3, then the
standard deviation of the Sharpe ratio estimator is equal to 0.303. Thus, for
instance, hedge funds, which search for high Sharpe ratios, often have a
higher standard deviation of the Sharpe ratio estimator than usual funds.

TABLE 5.1: Asymptotic standard deviation of the Sharpe ratio estimator
(rows: Sharpe ratio; columns: number T of observations)




However, most of the time, portfolio returns are not iid; hedge fund returns
are an example. Autocorrelations are high. Lo [359] shows that neglecting this
feature may lead to overestimation of the Sharpe ratio (by 65%, for example),
and that specific estimators must be used to avoid this problem.


5.1.3 Other performance measures 
5.1.3.1 The information ratio 


5.1.3.1.1 Information ratio definition As seen in Chapter 4, many 
funds are based on benchmark optimization. Thus, such funds must be evalu- 
ated by taking account of this feature. Recall that the set of optimal portfolios 
is another frontier, called the relative frontier. For all efficient portfolios of 
the standard mean-variance frontier, the Sharpe ratio is constant. For opti- 
mal portfolios relative to a benchmark, the new risk measure is usually the 
tracking error volatility. Then, a new performance measure adjusted to the 
relative risk can be introduced. 


DEFINITION 5.1 The information ratio is defined by:

$$
RI = \frac{\overline{R}_P - \overline{R}_b}{T}, \tag{5.22}
$$

where $T = \sigma(R_P - R_b)$ is the tracking-error volatility. An ex-post ratio can
also be introduced:

$$
\widehat{RI} = \frac{\overline{R}_P - \overline{R}_b}{\hat{\sigma}(R_P - R_b)}. \tag{5.23}
$$

PROPOSITION 5.4

As proved in Bertrand et al. [61], all portfolios belonging to the relative
frontier (except the benchmark itself) have the same information ratio. Moreover,
this ratio does not depend on the benchmark.

PROOF Using relation $T = D\sqrt{\sigma_1^2 - \sigma_0^2}$ (see Chapter 2), we deduce:

$$
RI = \frac{\overline{R}_P - \overline{R}_b}{D\sqrt{\sigma_1^2 - \sigma_0^2}}
= \frac{\overline{R}_P - \overline{R}_b}{\dfrac{\overline{R}_P - \overline{R}_b}{\overline{R}_1 - \overline{R}_0}\sqrt{\sigma_1^2 - \sigma_0^2}}
= \frac{\overline{R}_1 - \overline{R}_0}{\sqrt{\sigma_1^2 - \sigma_0^2}} = \text{constant}. \tag{5.24}
$$

Therefore, the information ratio of any portfolio of the relative frontier depends
only on characteristics of portfolios 0 and 1 (which belong to the mean-variance
frontier).


REMARK 5.6 Due to previous properties, the information ratio is suited
to measuring the performance of benchmark funds. Consider portfolios of
iso-beta curves in Chapter 4. Their information ratios depend on beta:

- If $\beta = 1$, portfolios of the iso-beta frontier (except the benchmark) have
the same information ratio in absolute value.

- If $\beta \neq 1$, portfolios of the iso-beta frontier have different information ratios.

This is a drawback since, for a given beta, the portfolio which has the highest
information ratio corresponds to an infinite standard deviation.

On the contrary, portfolios belonging to a same $\phi$-frontier (see mean-variance
optimization with tracking error constraint in Chapter 4) have the same
information ratio, whatever the value of the aversion to variance $\phi$. With respect
to the relative performance, they are indistinguishable.


5.1.3.2 Information ratio and statistical test of excess performance 


The information ratio can be viewed as a statistical test of the excess
performance w.r.t. the benchmark. Assume that the monthly portfolio and
benchmark returns are iid. The excess return is a random variable $ER$, with
expectation $\mu$ and unknown standard deviation.

Let $\overline{ER} = \overline{R}_P - \overline{R}_b$ be the empirical mean, and $s = \hat{\sigma}(R_P - R_b)$ the
empirical standard deviation of the excess return, estimated from a sample of
$n$ observations. Then, we can consider the following univariate test:

$$
H_0 : \mu \leq 0, \qquad H_1 : \mu > 0. \tag{5.25}
$$

Under the null hypothesis, $t_c = \sqrt{n}\,(\overline{ER} - 0)/s = \sqrt{n}\,\widehat{RI}$. The t-statistic

$$
t = \frac{\overline{ER} - \mu}{s/\sqrt{n}}
$$

has a Student distribution with $(n - 1)$ degrees of freedom. Therefore, the
information ratio corresponds to the statistical test (divided by the square root
of the number of observations) of the null hypothesis, which corresponds to the
case of no overperformance of the portfolio w.r.t. the benchmark.

Assume, for example, that $n$ is equal to 60 (5 years of monthly data). Then,
to reject hypothesis $H_0$ at the 5% threshold, it is necessary that $t_c > 1.671$.
The monthly information ratio must be higher than 0.2157 (which corresponds
to 0.747 for one year).

Suppose also that the mean information ratio, once management costs are taken
into account, is equal to 0.36 on an annual basis. In order to reject hypothesis
$\mu \leq 0$ and to conclude that the fund outperforms with a confidence level equal
to 95%, we must have $(0.36/\sqrt{12})\,\sqrt{n} = 1.645$. This corresponds to
$n = 250.5$ months, or equivalently to about 21 years. Thus, this result shows
that a long period of performance persistence is needed to conclude that the
better performance w.r.t. the benchmark is significant.
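The orders of magnitude above can be recomputed as follows. This is only a sketch: the Student quantile 1.671 is taken from the text rather than computed, the 0.36 information ratio is read as an annual figure (the interpretation consistent with the 250.5 months quoted), and the function names are illustrative.

```python
import math
from statistics import NormalDist


def min_monthly_ir(t_crit, n_months):
    """Smallest monthly information ratio rejecting H0, since t_c = sqrt(n) * RI."""
    return t_crit / math.sqrt(n_months)


def months_needed(annual_ir, z_crit):
    """Number of monthly observations such that sqrt(n) * monthly RI reaches z_crit."""
    monthly_ir = annual_ir / math.sqrt(12)
    return (z_crit / monthly_ir) ** 2


ir_month = min_monthly_ir(1.671, 60)      # 1.671: quantile quoted in the text
ir_year = ir_month * math.sqrt(12)        # annualized under the iid assumption
z95 = NormalDist().inv_cdf(0.95)          # Gaussian quantile, about 1.645
n = months_needed(0.36, z95)

print(round(ir_month, 4), round(ir_year, 3))
print(round(n, 1), "months, i.e. about", round(n / 12), "years")
```

The annualization by $\sqrt{12}$ relies on the iid assumption made at the beginning of this subsection.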


5.1.3.3 Adjusted measures 


5.1.3.3.1 The Sortino ratio Indicators based on standard deviations do 
not indicate whether the value is above or below the mean. Risk-adjusted 
measures based, for example, on semi-variance can separate these two cases. 
Moreover, they can better take account of asymmetrical return distributions. 
This is the purpose of the Sortino ratio, introduced in Sortino and Price [477]. 
It is defined similarly to the Sharpe ratio, but with a minimum acceptable return
(MAR) instead of the riskless return. Moreover, the standard deviation is
replaced by the semi-standard deviation of the return below the MAR:

$$
\text{Sortino ratio}_P = \frac{\overline{R}_P - MAR}{\sqrt{\dfrac{1}{T}\displaystyle\sum_{t=1,\; R_{P,t} < MAR}^{T}\left(R_{P,t} - MAR\right)^2}}. \tag{5.26}
$$
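A minimal sketch of the Sortino ratio is given below. It assumes the downside deviation is normalized by the total number of observations T (one common convention), and the monthly returns are hypothetical numbers.

```python
import math
from statistics import fmean


def sortino_ratio(returns, mar):
    """(mean return - MAR) divided by the downside deviation below the MAR."""
    downside_sq = [min(r - mar, 0.0) ** 2 for r in returns]
    downside_dev = math.sqrt(sum(downside_sq) / len(returns))
    if downside_dev == 0.0:
        raise ValueError("no return below the MAR: the ratio is undefined")
    return (fmean(returns) - mar) / downside_dev


# Hypothetical monthly returns, with a minimum acceptable return of zero.
returns = [0.03, -0.02, 0.01, 0.04, -0.01, 0.02]
ratio = sortino_ratio(returns, mar=0.0)
print(round(ratio, 4))
```

Only the two negative months contribute to the denominator, so upside volatility does not penalize the fund, unlike in the Sharpe ratio.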


5.1.3.3.2 The Morningstar rating system The Morningstar rating is 
also called the risk-adjusted rating (RAR). It is used in particular in the 
USA. It is based on a system of stars, and is devoted to the ranking of funds 
belonging to the same peer group. It is calculated as the difference between 
relative returns and relative risks: 


$$
RAR_P = RR_P - RRisk_P. \tag{5.27}
$$

Relative return and risk are defined as follows:

$$
RR_P = \frac{R_P}{BR_G}, \qquad RRisk_P = \frac{Risk_P}{BRisk_G}, \tag{5.28}
$$

where $P$ is the portfolio belonging to the peer group $G$, such as domestic
stock funds, international stock funds, taxable bond funds, and tax-exempt 
municipal bond funds as well as equity funds classified by style: capitalization 
(large-cap, mid-cap, small cap) or growth/value. 


Rp is the return of fund P in excess of the risk-free rate, and Riskp is the 
risk of fund P. 


BRg denotes the base which is used to calculate the relative returns of 
funds in the group G, and BRiskg is the base which is used to calculate the 
relative risks of funds in the group. 


The risk of the fund can be measured by first determining the average of
the negative values of the fund’s monthly returns in excess of the short-term
riskless rate, then by setting:

$$
Risk_P = -\frac{1}{T}\sum_{t=1}^{T}\min\left[R_{P,t},\, 0\right], \tag{5.29}
$$

where $T$ is the number of months.

Risk can also be measured by taking account of both upside and downside
volatility. Such a criterion penalizes
funds with highly volatile returns (both upside and downside). Indeed, extreme
gains (upside volatility) may be associated to potential extreme losses
(such as Internet funds in the early 2000s). Thus, this kind of performance
measure reduces the potential interest in high-risk funds.

The base is calculated by taking the average return of the $n$ funds contained
in the group. It is compared with the riskless rate $R_f$, and the larger of the
two is retained:

$$
BR_G = \max\left(\frac{1}{n}\sum_{i=1}^{n} R_{P_i},\; R_f\right). \tag{5.30}
$$

The base for the relative risk is defined as the average of the risks of the
funds in the peer group:

$$
BRisk_G = \frac{1}{n}\sum_{i=1}^{n} Risk_{P_i}. \tag{5.31}
$$
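Relations (5.27) to (5.31) can be combined in a few lines. The sketch below is an illustration under stated assumptions: the three funds and their monthly excess returns are hypothetical, the fund return $R_P$ is taken as the mean monthly excess return, and the function names are not from the text.

```python
from statistics import fmean


def fund_risk(excess_returns):
    """Average monthly shortfall, Eq. (5.29): -(1/T) * sum of min(R_t, 0)."""
    return -sum(min(r, 0.0) for r in excess_returns) / len(excess_returns)


def morningstar_rar(group, r_f):
    """Risk-adjusted rating of each fund of a peer group, Eqs. (5.27)-(5.31)."""
    mean_ret = {name: fmean(rets) for name, rets in group.items()}
    risk = {name: fund_risk(rets) for name, rets in group.items()}
    base_ret = max(fmean(mean_ret.values()), r_f)     # Eq. (5.30)
    base_risk = fmean(risk.values())                  # Eq. (5.31)
    return {name: mean_ret[name] / base_ret - risk[name] / base_risk
            for name in group}


# Hypothetical monthly excess returns of three funds of the same peer group.
group = {
    "F1": [0.02, -0.01, 0.03, 0.00],
    "F2": [0.01, 0.01, -0.02, 0.02],
    "F3": [0.04, -0.03, 0.05, -0.02],
}
rar = morningstar_rar(group, r_f=0.003)
for name in sorted(rar, key=rar.get, reverse=True):
    print(name, round(rar[name], 3))
```

In this example F1 and F3 have the same mean excess return, but F3 is ranked last because its average shortfall is five times larger.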
Sharpe [467] examines properties and limitations of this measure, which is 
not appropriate for measuring the risk of funds held over a long period. 
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5.1.3.3.3 The Dowd ratio The performance analysis can be also based 
on other risk measures, such as the Value-at-Risk measure (see Chapter 2). 
For example, we can consider a Sharpe-type ratio in which the standard
deviation is replaced with a risk measure based on the VaR:


$$ \frac{\mathbb{E}[R_P] - R_f}{VaR_P / V_{P,0}}, \tag{5.32} $$

where $VaR_P$ denotes the VaR of portfolio $P$, and $V_{P,0}$ is the initial value of
this portfolio.


Dowd [170] introduces a measure of investment process based on the VaR. 


Consider an investor who holds a portfolio which can be modified by
introducing a new asset. To analyze the potential benefit of introducing this
asset in the portfolio, the investor defines the risk in terms of the increase
in the portfolio’s VaR. The asset will be included if the excess return that it
provides is not offset by too high an incremental VaR.


Assuming for example that portfolio returns are normally distributed, the
VaR of the portfolio $P$ is proportional to the standard deviation:

$$ VaR = -\alpha\, \sigma(R_P)\, N, \tag{5.33} $$

where $\alpha$ denotes the confidence parameter for which the VaR is estimated,
$\sigma(R_P)$ is the standard deviation of the portfolio returns, and $N$ is a parameter
linked to the size of the portfolio. Then, asset $A$ will be included in the
portfolio with weight $a$ if and only if:


$$ R_A > R_P + \frac{R_P}{a} \left( \frac{VaR_{P_A}}{VaR_P} - 1 \right), \tag{5.34} $$

where $P_A$ denotes the new portfolio with asset $A$. Denote the incremental
VaR by:

$$ IVaR = VaR_{P_A} - VaR_P. $$


Define the function $\eta_A$ by:

$$ \eta_A(VaR) = \frac{1}{a}\, \frac{IVaR}{VaR_P}. $$

The function $\eta_A$ is the percentage increase in the VaR due to the purchase of
asset $A$, per unit of weight $a$. Then, asset $A$ will be included in the portfolio
with weight $a$ if and only if:

$$ R_A > \left(1 + \eta_A(VaR)\right) R_P. \tag{5.35} $$
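
The decision rule (5.35) can be illustrated with a short Python sketch. The function names and the numerical values below are hypothetical; only the formulas for $\eta_A$ and the inclusion condition come from the text.

```python
def incremental_var_ratio(var_old, var_new, a):
    """eta_A(VaR) = (1/a) * (VaR_{P_A} - VaR_P) / VaR_P."""
    return (var_new - var_old) / (a * var_old)

def include_asset(r_a, r_p, var_old, var_new, a):
    """Rule (5.35): include asset A with weight a if R_A > (1 + eta_A) * R_P."""
    return r_a > (1.0 + incremental_var_ratio(var_old, var_new, a)) * r_p

# Hypothetical figures: a 5% position raises the portfolio VaR from 100 to 102,
# so eta_A = 0.4 and the required return is 1.4 times the portfolio return.
required = (1.0 + incremental_var_ratio(100.0, 102.0, 0.05)) * 0.08
```

With these figures, the new asset must offer more than 11.2% when the current portfolio returns 8%.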
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5.1.4 Beyond the CAPM 


Criticisms of the CAPM have led to the introduction of more sophisticated
statistical models, such as ARCH models. The volatility may depend on past
asset values, or the coefficient beta may evolve according to given factors.


5.1.4.1 Performance measurement using a conditional beta 


Amenc and Le Sourd [22] propose to introduce ARCH modelling for the
identified factors which influence asset return dynamics (not for the assets
themselves, since the number of assets is much higher than the number of factors).


Denote by $n$ the number of assets and by $R_t$ the vector of the returns at time
$t$. Consider the conditional expectation $\mathbb{E}[R_t \mid \mathcal{F}_{t-1}]$ and variance-covariance
matrix $\mathbb{V}[R_t \mid \mathcal{F}_{t-1}]$ of the vector of returns. Assume that these terms are
given by the following equation:

$$ R_t = \beta f_t + u_t, \tag{5.36} $$


with

$$ \mathbb{E}[u_t \mid \mathcal{F}_{t-1}] = 0, \qquad \mathbb{V}[u_t \mid \mathcal{F}_{t-1}] = \sigma^2. \tag{5.37} $$


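The factor $f_t$ satisfies

$$ f_t = \delta + \varepsilon_t, \tag{5.38} $$

with $(\varepsilon_t)_t$ and $(u_t)_t$ independent and

$$ \mathbb{E}[\varepsilon_t \mid \mathcal{F}_{t-1}] = 0, \qquad \mathbb{V}[\varepsilon_t \mid \mathcal{F}_{t-1}] = c + \sum_{i=1}^{q} a_i\, \varepsilon_{t-i}^2 + \sum_{j=1}^{p} b_j\, h_{t-j}, \tag{5.39} $$

where $h_t$ denotes the conditional variance $\mathbb{V}[\varepsilon_t \mid \mathcal{F}_{t-1}]$.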


Thus

$$ \mathbb{E}[R_t \mid \mathcal{F}_{t-1}] = \beta \delta, \quad \text{where } \delta \text{ is a scalar.} $$





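In addition, the variance-covariance matrix of the $n$ assets can be written
as follows:

$$ \mathbb{V}[R_t \mid \mathcal{F}_{t-1}] =
\begin{pmatrix} \beta_1^2 & \cdots & \beta_1 \beta_n \\ \vdots & \ddots & \vdots \\ \beta_n \beta_1 & \cdots & \beta_n^2 \end{pmatrix}
\mathbb{V}[\varepsilon_t \mid \mathcal{F}_{t-1}]
+ \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_n^2 \end{pmatrix}. \tag{5.40} $$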
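
To make the structure of (5.39) and (5.40) concrete, the following Python sketch computes the conditional variance of the factor and the resulting conditional covariance matrix of the assets. It is restricted to the case $p = q = 1$ for brevity, and the function names and parameter values are hypothetical.

```python
import numpy as np

def garch11_variance(eps, c, a1, b1, h0):
    """Conditional variances h_t = c + a1 * eps_{t-1}^2 + b1 * h_{t-1} (case p = q = 1)."""
    h = np.empty(len(eps) + 1)
    h[0] = h0                      # initial conditional variance (user-supplied)
    for t in range(1, len(h)):
        h[t] = c + a1 * eps[t - 1] ** 2 + b1 * h[t - 1]
    return h

def conditional_covariance(beta, h_t, sigma2):
    """Eq. (5.40): beta beta' * V[eps_t | F_{t-1}] + diag(sigma_1^2, ..., sigma_n^2)."""
    beta = np.asarray(beta, dtype=float)
    return np.outer(beta, beta) * h_t + np.diag(sigma2)

# Hypothetical example with two assets and two observed factor innovations.
h = garch11_variance([0.1, -0.2], c=0.01, a1=0.1, b1=0.8, h0=0.05)
cov = conditional_covariance([1.0, 2.0], h[-1], [0.01, 0.02])
```

Only the single factor variance is modelled dynamically, which is why the approach remains tractable when the number of assets is large.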


Therefore, for a portfolio with weights $(w_1, \ldots, w_n)$, we have:

$$ R_{P,t} = \sum_{i=1}^{n} w_{i,t} R_{i,t} = \sum_{i=1}^{n} w_{i,t} \left( \beta_i f_t + u_{i,t} \right). \tag{5.41} $$

Set $\beta_{P,t} = \sum_{i=1}^{n} w_{i,t} \beta_i$ and $u_{P,t} = \sum_{i=1}^{n} w_{i,t} u_{i,t}$. Then, the Jensen measure is
estimated from the relation

$$ R_{P,t} = \alpha_P + \beta_{P,t} f_t + u_{P,t}. \tag{5.42} $$

The coefficients $\alpha_P$ and $\beta_{P,t}$ are estimated by regression from time
series on the portfolio and the market.


5.1.4.2 Performance measurement using a conditional beta 


A conditional version of the CAPM can be introduced, as in Ferson and 
Schadt [222]. The conditional CAPM is such that, for any asset i: 


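$$ R_{i,t+1} - R_{f,t+1} = \beta_{i,m,t} \left( R_{M,t+1} - R_{f,t+1} \right) + u_{i,t+1}, \tag{5.43} $$

with

$$ \mathbb{E}[u_{i,t+1} \mid \mathcal{F}_t] = 0, \qquad \mathbb{E}[u_{i,t+1} (R_{M,t+1} - R_{f,t+1}) \mid \mathcal{F}_t] = 0. \tag{5.44} $$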














The filtration $\mathcal{F}_t$ represents the public information available at time $t$. The
beta $\beta_{i,m,t}$ of the regression is a conditional beta which depends on the
information $\mathcal{F}_t$, generated by a given set of factors.

If only the information $\mathcal{F}_t$ is used, then the alpha term is null. The error term
$u_{i,t+1}$ in the regression is independent of the information $\mathcal{F}_t$. Therefore, this
model is consistent with the efficient market hypothesis. If $\mathcal{F}_t$ is the $\sigma$-algebra
generated by a process $(I_t)_t$, then, using a first-order Taylor approximation, the
beta can be approximated by a linear function:

$$ \beta_{P,m,t} = b_{0,P} + B_P' (I_t - \bar{I}_t), \tag{5.45} $$

where $b_{0,P}$ is an average beta which is equal to the unconditional mean of the
conditional beta: $b_{0,P} = \mathbb{E}[\beta_{P,m,t}]$. The components of the vector $B_P$ are
the response coefficients of the conditional beta w.r.t. the random variable $I_t$.
Then, a conditional formulation of portfolio return can be deduced:














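$$ R_{P,t+1} - R_{f,t+1} = b_{0,P} \left( R_{M,t+1} - R_{f,t+1} \right) + B_P' (I_t - \bar{I}_t) \left( R_{M,t+1} - R_{f,t+1} \right) + u_{P,t+1}, \tag{5.46} $$

with

$$ \mathbb{E}[u_{P,t+1} \mid \mathcal{F}_t] = 0, \qquad \mathbb{E}[u_{P,t+1} (R_{M,t+1} - R_{f,t+1}) \mid \mathcal{F}_t] = 0. \tag{5.47} $$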





Thus, this is a stochastic factor model which is a linear function of the
excess market return, the coefficients of which depend linearly on $I_t$. It is
a time-dependent generalization of the standard CAPM and can be used to
examine, for example, the Jensen measure:

$$ R_{P,t+1} - R_{f,t+1} = \alpha_{cP} + b_{0,P} \left( R_{M,t+1} - R_{f,t+1} \right) + B_P' (I_t - \bar{I}_t) \left( R_{M,t+1} - R_{f,t+1} \right) + \varepsilon_{P,t+1}, \tag{5.48} $$

where $\alpha_{cP}$ is the average difference between the excess return of the portfolio
and the excess return of a dynamic reference strategy. To improve the alpha
forecast, Ferson and Schadt [222] assume a relationship between the portfolio
risk and the market indicators: for example, the market index dividend yield
$(DY_t)_t$ and the return on short-term T-bills $(TB_t)_t$.


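Denote by $dy_t$ and $tb_t$ the random variables equal to the deviations of the
variables $DY_t$ and $TB_t$ from their averages:

$$ dy_t = DY_t - \mathbb{E}[DY_t], \qquad tb_t = TB_t - \mathbb{E}[TB_t]. \tag{5.49} $$

Therefore, we have:

$$ I_t - \bar{I}_t = \begin{pmatrix} dy_t \\ tb_t \end{pmatrix} \quad \text{and} \quad B_P = \begin{pmatrix} b_{1,P} \\ b_{2,P} \end{pmatrix}. $$

The conditional beta can be written as:

$$ \beta_{P,m,t} = b_{0,P} + b_{1,P}\, dy_t + b_{2,P}\, tb_t. $$

Then, the conditional formulation of the Jensen model is given by:

$$ R_{P,t+1} - R_{f,t+1} = \alpha_{cP} + b_{0,P} \left( R_{M,t+1} - R_{f,t+1} \right) \tag{5.50} $$
$$ \qquad\qquad + \left( b_{1,P}\, dy_t + b_{2,P}\, tb_t \right) \left( R_{M,t+1} - R_{f,t+1} \right) + \varepsilon_{P,t+1}, \tag{5.51} $$

where $\alpha_{cP}$ is the conditional performance measure, $b_{0,P}$ is the average
conditional beta, and $b_{1,P}$ and $b_{2,P}$ indicate the variations in the conditional
beta with respect to the dividend yield and the return on the T-bills. These
coefficients are estimated through regression from the time series of the variables.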
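
The regression (5.50)-(5.51) can be sketched as an ordinary least-squares fit. The sketch below is only an illustration: the function name is hypothetical, the expectations in (5.49) are replaced by sample means (an assumption), and no standard errors or significance tests are computed.

```python
import numpy as np

def conditional_jensen(rp_excess, rm_excess, dy_lag, tb_lag):
    """OLS estimate of (alpha_c, b0, b1, b2) in the conditional Jensen model.

    rp_excess[k], rm_excess[k]: portfolio and market excess returns over (t, t+1).
    dy_lag[k], tb_lag[k]: dividend yield and T-bill return observed at time t.
    """
    rp = np.asarray(rp_excess, dtype=float)
    rm = np.asarray(rm_excess, dtype=float)
    dy = np.asarray(dy_lag, dtype=float)
    tb = np.asarray(tb_lag, dtype=float)
    dy = dy - dy.mean()            # dy_t of (5.49), sample mean used for E[DY_t]
    tb = tb - tb.mean()            # tb_t of (5.49), sample mean used for E[TB_t]
    X = np.column_stack([np.ones_like(rm), rm, dy * rm, tb * rm])
    coef, *_ = np.linalg.lstsq(X, rp, rcond=None)
    return coef
```

The first coefficient returned is the conditional alpha; the remaining three describe how the beta moves with the lagged information variables.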


5.1.4.3 Performance measurement independent of the market mod- 
el 


Due to Roll’s criticism, measures that do not depend on the market model
have been developed, such as the Cornell measure and the Grinblatt and
Titman measure. The goal of these measures is to evaluate the manager’s
capacity to select stocks that have higher returns than the average. This
average must be defined, which is one drawback of these measures.


5.1.4.3.1 The Cornell measure The Cornell measure (see Cornell [130])
is defined as the average difference between the return on the investor’s
portfolio during the management period and the return of the reference portfolio
with the same weighting but observed over a period different from the investor’s
management period. The calculation is made after the investment horizon.


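The Cornell measure can be written as:

$$ C = \hat{R}_P - \hat{\beta}_P \hat{R}_B, \tag{5.52} $$

where the symbol $\hat{X}_P$ denotes the P-limit (limit in probability) of $\frac{1}{T} \sum_{t=1}^{T} X_{P,t}$.
Therefore, we get:

$$ C = \text{P-limit of } \left[ \frac{1}{T} \sum_{t=1}^{T} \beta_{P,t} \left( R_{B,t} - \hat{R}_B \right) \right] + \hat{\varepsilon}_P. \tag{5.53} $$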





This means that C is the sum of the selectivity and timing components in 
the Jensen measure decomposition. Thus, if the investor has no particular 
skill in terms of timing or selectivity, Jensen and Cornell measures give a null 
performance. 


5.1.4.3.2 The Grinblatt and Titman measures Grinblatt and Titman
([269], [270]) propose a measure that improves upon the Jensen measure by
taking better account of market timing, without requiring any information
about portfolio weighting. This method assumes that, if a manager has a
market timing skill, then his performance would be observed over several periods.
The measure attributes a null performance to uninformed fund managers. It
is defined as follows. Let $w_{P,t}$ denote the weighting attributed to the return
for period $t$. Consider the return $R_{B,t}$ of the reference portfolio for the period
$t$. We have: $\sum_{t=1}^{T} w_{P,t} (R_{B,t} - R_f) = 0$. Then, a first Grinblatt and Titman
measure is given by:

$$ GB1 = \sum_{t=1}^{T} w_{P,t} \left( R_{P,t} - R_f \right). \tag{5.54} $$


Therefore, a positive GB1 indicates that the fund manager has good
forecasts on market dynamics. However, this measure supposes that the portfolio
weighting is well determined.

For this reason, Grinblatt and Titman [271] introduce another measure
which takes account of the evolution of the portfolio’s composition. This approach
supposes that an informed fund manager modifies the portfolio weights
according to his forecast of the market’s evolution. Thus, covariances between
asset returns and their weights are not null. The measure involves these
covariances:

$$ GB2 = \frac{1}{T} \sum_{i=1}^{n} \sum_{t=1}^{T} \left( w_{i,t} - w_{i,t-k} \right) \left( R_{i,t} - R_{f,t} \right), \tag{5.55} $$

where $w_{i,t}$ and $w_{i,t-k}$ are the weights of asset $i$ at times $t$ and $t-k$.
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The expectation of this measure is null if the fund manager is not informed, 
and positive otherwise. No reference portfolio is used, but this method re- 
quires a large amount of data and calculation. 
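
A minimal Python sketch of the measure (5.55) is given below. The function name and array layout are hypothetical, and $T$ is taken here to be the number of periods for which a lagged weight is available, which is an assumption of this sketch.

```python
import numpy as np

def grinblatt_titman_gb2(weights, returns, risk_free, k=1):
    """GB2 = (1/T) * sum_i sum_t (w_{i,t} - w_{i,t-k}) * (R_{i,t} - R_{f,t}).

    weights, returns: arrays of shape (number of dates, n assets).
    risk_free: array of shape (number of dates,).
    """
    w = np.asarray(weights, dtype=float)
    r = np.asarray(returns, dtype=float)
    rf = np.asarray(risk_free, dtype=float)
    dw = w[k:] - w[:-k]                 # weight changes over k periods
    excess = r[k:] - rf[k:, None]       # asset returns in excess of the riskless rate
    return (dw * excess).sum() / dw.shape[0]
```

A manager who never changes the weights obtains a measure equal to zero, whatever the realized returns.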


5.1.4.4 Performance measurement and multi-factor models 


Generalizations of the CAPM can be based on multi-factor models, which 
are also linear models but do not make any assumption on the investor’s risk 
aversion. 


A first model, introduced by Ross [439], is based on a specific arbitrage 
valuation. It is called the Arbitrage Pricing Theory (APT). It does not assume 
normality of returns and supposes only that investors are risk-averse, without 
specifying a particular utility function. We have: 














$$ R_{i,t} = \mathbb{E}[R_i] + \sum_{k=1}^{K} b_{i,k} F_{k,t} + \varepsilon_{i,t}, \tag{5.56} $$

where $b_{i,k}$ denotes the sensitivity of asset $i$ to factor $k$, $F_{k,t}$ denotes the return
of factor $k$ with $\mathbb{E}[F_{k,t}] = 0$, and $\varepsilon_{i,t}$ denotes the residual return of asset $i$. It
is the specific risk of asset $i$ which is not explained by the factors and satisfies:

$$ \mathbb{E}[\varepsilon_{i,t}] = 0. $$


























• The APT model supposes that markets are perfectly efficient and that
the factor model is the same for all investors.

• The number $n$ of assets is assumed to be very large w.r.t. the number
$K$ of factors.


The residuals are uncorrelated with each other and uncorrelated with the
factors:

$$ Cov(\varepsilon_{i,t}, \varepsilon_{j,t}) = 0, \quad \forall i \neq j; \qquad Cov(\varepsilon_{i,t}, F_{k,t}) = 0, \quad \forall i, \forall k. \tag{5.57} $$


Arbitrage conditions lead to the existence of factor risk premia $\lambda_k$ such
that:

$$ \mathbb{E}[R_i] - R_f = \sum_{k=1}^{K} \lambda_k b_{i,k}. \tag{5.58} $$











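Denote by $\delta_k$ the expected return of a portfolio with a sensitivity to factor $k$
equal to 1, and null sensitivity to the other factors. Then:

$$ \lambda_k = \delta_k - R_f \quad \text{and} \quad \mathbb{E}[R_i] - R_f = \sum_{k=1}^{K} \left( \delta_k - R_f \right) b_{i,k}, \tag{5.59} $$

where the coefficients $b_{i,k}$ are the sensitivities to the factors (the factor loadings).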
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When there exists only one factor corresponding to the market return, this 
model is the CAPM. The problem is to identify the number of factors. Numer- 
ous empirical studies are devoted to the determination of the macroeconomic 
or financial factors. For example, the three-factor model of Fama and French 
[219] takes account of the book-to-market ratio and the company’s size mea- 
sured by its market capitalization: 


$$ R_{P,t} - R_{f,t} = \alpha_P + \beta_P \left( R_{M,t} - R_{f,t} \right) + b_s\, SMB_t + b_v\, HML_t + \varepsilon_{P,t}, \tag{5.60} $$

where $SMB_t$ indicates “small (cap) minus big.” It measures the excess return
of small-capitalization stocks over large-capitalization stocks.

The term $HML_t$ is the “high (book/price) minus low.” It denotes the
difference between returns on portfolios with high book-to-market ratios and
portfolios with low book-to-market ratios.


Fama and French assume that the market is efficient, but that more than 
one factor is needed to explain asset returns. 


Carhart’s four-factor model [105] introduces one additional factor: the
PR1YR, which denotes the difference between the average of the highest
returns and the average of the lowest returns from the previous year. Thus, we
have the following decomposition:

$$ R_{P,t} - R_{f,t} = \alpha_P + \beta_P \left( R_{M,t} - R_{f,t} \right) + b_s\, SMB_t + b_v\, HML_t + b_p\, PR1YR_t + \varepsilon_{P,t}. \tag{5.61} $$

The Barra multi-factor model (see Barra [42] and Scheikh [453]) supposes 

that asset returns are determined by the firm’s characteristics: size, earnings, 
industrial sectors, etc., which are the factors used jointly with risk indices. 


REMARK 5.7 As can be seen, the application of multi-factor models
to performance measurement requires the choice of a specific model to take
account of industrial and financial factors. Endogenous factors can also be
extracted through a factor analysis, based either on the maximum likelihood
principle or on principal component analysis, which also provides the
factor loadings. However, in that case, these factors are not well identified.
Therefore, we can search for an explanation of these factors from known
indicators. Then, we can deduce an explicit decomposition from the implicit
decomposition.
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5.2 Performance decomposition 


Since investors nowadays request more information about the investment
process of fund managers, alternative performance measures are necessary to
identify the sources of performance. Performance attribution tries to
decompose the excess performance into identified terms by taking more account
of the management process.


5.2.1 The Fama decomposition 


Fama [218] proposes a performance decomposition which separates the fund 
performance into two parts: 


• The selectivity; and
• The risk.


This analysis is made in the CAPM framework. A portfolio $P$ is compared
with a “naive” portfolio $C$, which consists of investing a weight $x$ in the
riskless asset and $(1 - x)$ in the market portfolio, such that the beta of $C$
is equal to the beta of $P$: $\beta_C = \beta_P$. The portfolio $C$ is thus the
efficient portfolio which has the same systematic risk as portfolio $P$ and does
not require any forecasting ability. We have:

$$ R_C = (1 - \beta_P) R_f + \beta_P R_M. \tag{5.62} $$


The portfolio P may be less diversified. This risk may allow for an excess 
performance w.r.t. portfolio C. 


Fama proposes the following decomposition:

$$ \underbrace{R_P - R_f}_{\text{Total performance}} = \underbrace{[R_P - R_C]}_{\text{Selectivity}} + \underbrace{[R_C - R_f]}_{\text{Risk}}. \tag{5.63} $$


The selectivity term measures the part of the performance obtained beyond
the reward for the systematic risk borne by the fund manager. It is equal to
the Jensen alpha. Portfolios $P$ and $C$ have the same systematic risk. However,
their total risks are different. If portfolio $P$ represents the main part of the
investor’s wealth, then the total risk is the relevant risk measure. In that case,
portfolio $P$ must be compared with a naive portfolio with the same total risk
rather than the same systematic risk.


Thus, Fama proposes the following selectivity decomposition:

$$ \underbrace{[R_P - R_C]}_{\text{Selectivity}} = \text{Net Selectivity} + \underbrace{[R_{C'} - R_C]}_{\text{Diversification}}, \tag{5.64} $$

where

$$ \text{Net Selectivity} = \underbrace{[R_P - R_C]}_{\text{Selectivity}} - \underbrace{[R_{C'} - R_C]}_{\text{Diversification}}. \tag{5.65} $$


By definition, $R_{C'}$ is the return of a portfolio which contains the riskless asset
and the market portfolio with the same total risk as portfolio $P$. The portfolio
$C'$ is also efficient. If the net selectivity is negative, then the fund manager
has taken a diversifiable risk which has not been compensated by excess return.
The diversification term takes account of this additional return (i.e., $\sigma(R_P) - \beta_P$).
It is always positive.


The risk $R_C - R_f$ can also be decomposed as follows:

$$ \underbrace{[R_C - R_f]}_{\text{Risk}} = \underbrace{[R_C - R_T]}_{\text{Manager’s risk}} + \underbrace{[R_T - R_f]}_{\text{Investor’s risk}}. \tag{5.66} $$

The investor has a risk objective $\beta_T$ which leads to a portfolio with return
$R_T$ on the security market line. The fund manager chooses a portfolio with
risk $\beta_P$. Therefore, the performance component due to the total risk is due
to the risk level fixed by the investor, and to the risk level fixed by the fund
manager.
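
The whole Fama decomposition can be reproduced numerically with the sketch below. The function name and the input figures used to exercise it are hypothetical. The only step not written explicitly in the text is the beta of the portfolio $C'$: a mix of the riskless asset and the market portfolio with total risk $\sigma(R_P)$ holds a weight $\sigma(R_P)/\sigma(R_M)$ in the market, which is therefore its beta.

```python
def fama_decomposition(r_p, r_m, r_f, beta_p, sigma_p, sigma_m, beta_target=None):
    """Return the terms of (5.63)-(5.66) as a dictionary."""
    def naive_return(beta):
        # Return of a riskless-asset/market mix with the given beta, as in (5.62).
        return r_f + beta * (r_m - r_f)

    r_c = naive_return(beta_p)                    # same systematic risk as P
    r_c_prime = naive_return(sigma_p / sigma_m)   # same total risk as P
    terms = {
        "total": r_p - r_f,
        "selectivity": r_p - r_c,
        "risk": r_c - r_f,
        "diversification": r_c_prime - r_c,
        "net_selectivity": (r_p - r_c) - (r_c_prime - r_c),
    }
    if beta_target is not None:
        r_t = naive_return(beta_target)           # investor's target portfolio
        terms["manager_risk"] = r_c - r_t
        terms["investor_risk"] = r_t - r_f
    return terms
```

For instance, with a portfolio return of 12%, a market return of 10%, a riskless rate of 3% and a beta of 0.9, the selectivity is 2.7% and the risk term is 6.3%.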


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FIGURE 5.5: FAMA performance decomposition 
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5.2.2 Other performance attributions 


Two kinds of performance attribution can be distinguished: 


• The external attribution, which uses exogenous information w.r.t. the
time series of the portfolio and benchmark returns.

• The internal attribution, which uses the time series of the portfolio and
benchmark weightings.


5.2.3 The external attribution 


The following methods allow for the determination of the fund manager’s 
ability to forecast the global evolution of the market and to select assets well. 


5.2.3.1 Treynor-Mazuy 


In order to determine the quality of the forecast concerning the global
evolution of the market, Treynor and Mazuy [495] examine the beta evolution w.r.t.
market evolution. Typically, a successful market timing strategy is associated
with a beta which is higher than 1 when the market is bullish, and smaller than
1 when the market is bearish. If the fund manager has no market timing
goal, then $\beta_P$ is constant. Then, without specific risk, points with components
$(R_{M,t}, R_{P,t})$ would be on the straight line with equation $R_{P,t} = \beta_P R_{M,t}$. With
specific risk, but still with $\beta_P$ constant, one would observe a set of points such
as in the following figure.


FIGURE 5.6: Linear regression of $R_P$ on $R_M$ without market timing
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However, as soon as the fund manager tries to forecast the market evo- 
lution and modifies his beta according to this, one would observe points in 
the previous figure above the regression line for various market return values 
$R_{M,t}$. The set of points would be fitted by a convex curve if the market
timing strategy is successful, as shown in Figure 5.7.


FIGURE 5.7: Linear regression of $R_P$ on $R_M$ with successful market
timing


Treynor and Mazuy propose a statistical method based on this property.
More precisely, in order to allow for a convex relation between $R_{M,t}$ and $R_{P,t}$,
they introduce the following equation:

$$ R_{P,t} - R_{f,t} = \alpha_P + \beta_P \left( R_{M,t} - R_{f,t} \right) + \delta_P \left( R_{M,t} - R_{f,t} \right)^2 + \varepsilon_{P,t}. \tag{5.67} $$

If the coefficient $\delta_P$ is significantly different from zero, then we can conclude that
the fund manager has used a market-timing strategy (with success if $\delta_P > 0$
and failure if $\delta_P < 0$).


Moreover, the alpha coefficient is still viewed as the fund manager’s ability 
to select the assets with returns above the values given by the CAPM. 
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5.2.3.2 Henrickson-Merton 


Henrickson and Merton [290] assume that the fund manager can forecast a
comparison between $R_{M,t}$ and $R_{f,t}$: $R_{M,t} > R_{f,t}$ or $R_{M,t} < R_{f,t}$. This
assumption leads to the following regression model:

$$ R_{P,t} - R_{f,t} = \alpha_P + \beta_P \left( R_{M,t} - R_{f,t} \right) + \delta_P D_t \left( R_{M,t} - R_{f,t} \right) + \varepsilon_{P,t}, \tag{5.68} $$

with $D_t = 1$ if $R_{M,t} > R_{f,t}$, and $D_t = 0$ if $R_{M,t} < R_{f,t}$.

Then, the beta can take two values: $\beta_P + \delta_P$ (resp. $\beta_P$) is the portfolio beta
when the market return is higher (resp. smaller) than the riskless return. A
market-timing strategy is successful if $\delta_P$ is significantly positive. The fund
manager’s ability to select assets is determined from the Jensen alpha.
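
Both timing regressions, the quadratic one of Treynor and Mazuy and the dummy-variable one above, can be estimated by ordinary least squares. The following Python sketch returns the point estimates only; the function names are hypothetical and the significance tests mentioned in the text are deliberately left out.

```python
import numpy as np

def _ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def treynor_mazuy(rp_excess, rm_excess):
    """Quadratic timing regression: returns (alpha_P, beta_P, delta_P)."""
    rp = np.asarray(rp_excess, dtype=float)
    rm = np.asarray(rm_excess, dtype=float)
    return _ols(np.column_stack([np.ones_like(rm), rm, rm ** 2]), rp)

def dummy_timing(rp_excess, rm_excess):
    """Dummy-variable timing regression: returns (alpha_P, beta_P, delta_P).

    D_t equals 1 when the market excess return is positive and 0 otherwise.
    """
    rp = np.asarray(rp_excess, dtype=float)
    rm = np.asarray(rm_excess, dtype=float)
    d = (rm > 0).astype(float)
    return _ols(np.column_stack([np.ones_like(rm), rm, d * rm]), rp)
```

In both cases a positive third coefficient points to a beta that rises with the market, which is the signature of successful timing.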


5.2.4 The internal attribution 


The purpose of this performance attribution is to explain the excess
performance w.r.t. a benchmark, jointly determined by the investor and the fund
manager. As seen in Chapter 4, the management process can be divided
into the asset allocation and the tactical strategy based on stock picking and
market timing.


5.2.4.1 The Brinson model 


Brinson et al. [90] propose the following method: consider a portfolio P 
and a benchmark B which contain n asset classes i = 1,...,n. 

Denote:

- $U$ the set of assets, $l = 1, \ldots, m = \#(U)$; $\{U_1, \ldots, U_n\}$ is a partition of the
set $U$ and is the set of the $n$ asset classes $i = 1, \ldots, n \leq m$;

- $Z_l$ the return of asset $l$ (vector: $Z$). This is a random variable if we
consider ex-ante values. This is a scalar if ex-post data are analyzed;

- $R_l$ the expected return of asset $l$ (vector: $R$);

- $\sigma_{kl}$ the covariance between assets $k$ and $l$ (matrix: $V$);

- $z_{pl}$ (resp. $z_{bl}$) the weight of asset $l$ in the portfolio (resp. the benchmark)
(vectors: $z_p$ and $z_b$);

- $w_{pi} = \sum_{l \in U_i} z_{pl}$ (resp. $w_{bi} = \sum_{l \in U_i} z_{bl}$) the weight of asset class $i$ in the
portfolio (resp. the benchmark);

- $w_{pl}^{(i)} = z_{pl} / w_{pi}$ and $w_{bl}^{(i)} = z_{bl} / w_{bi}$, for $l \in U_i$, the weights of asset $l$ within the asset
class $i$ of the portfolio and the benchmark;

- $Z_{pi} = \sum_{l \in U_i} w_{pl}^{(i)} Z_l$ (resp. $Z_{bi} = \sum_{l \in U_i} w_{bl}^{(i)} Z_l$) the return of asset class $i$ in
the portfolio (resp. the benchmark); and,

- $R_{pi} = \sum_{l \in U_i} w_{pl}^{(i)} R_l$ (resp. $R_{bi} = \sum_{l \in U_i} w_{bl}^{(i)} R_l$) the mean return of asset
class $i$ in the portfolio (resp. in the benchmark).
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Then, the excess global performance of the portfolio w.r.t. the benchmark 
is given by: 


$$ S = R_p - R_b = \sum_{i=1}^{n} \left( w_{pi} R_{pi} - w_{bi} R_{bi} \right) = \sum_{l=1}^{m} \left( z_{pl} - z_{bl} \right) R_l. \tag{5.69} $$

This excess performance has to be decomposed according to the main management
processes: asset allocation and asset selection.


5.2.4.1.1 Asset allocation effect When searching for the origin of
excess performance, we encounter a problem due to the allocation effect for one
particular asset class. Indeed, the high weighting (resp. the low weighting) of
an asset class leads to the low or high weighting of at least one other class.

One would expect to consider the following measure for asset class $i$:

$$ \left( w_{pi} - w_{bi} \right) R_{bi}. \tag{5.70} $$


However the following example shows that this measure is not well adapted. 


Example 5.2 


TABLE 5.2: Asset allocation (percentages)

Class    w_pi    w_bi    R_bi          (w_pi - w_bi) R_bi
1          25      15       8                  0.8
2          40      45      10                 -0.5
3          35      40      12                 -0.6
Total     100     100    R_b = 10.50          -0.3


The last column leads to the conclusion that the high weighting of asset 
class 1 is judicious, and that the relatively bad global performance of the 
portfolio is due to the low weighting of asset classes 2 and 3. Nevertheless, we 
can see that the fund manager has highly weighted an asset class which has 
a return smaller than the mean return of the benchmark (8% versus 10.5%). 


This example shows that we must take into account the difference between
the class return and the mean benchmark return. Therefore, the contribution
of asset class $i$ to the excess performance is defined by:

$$ AA_i = \left( w_{pi} - w_{bi} \right) \left( R_{bi} - R_b \right). \tag{5.71} $$
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The following table indicates the different cases: 


TABLE 5.3: Contribution to asset classes

                            Excess performance class i:       Bad performance class i:
                            (R_bi - R_b) > 0                  (R_bi - R_b) < 0
High weighting class i:     Good decision                     Bad decision
(w_pi - w_bi) > 0           (w_pi - w_bi)(R_bi - R_b) > 0     (w_pi - w_bi)(R_bi - R_b) < 0
Low weighting class i:      Bad decision                      Good decision
(w_pi - w_bi) < 0           (w_pi - w_bi)(R_bi - R_b) < 0     (w_pi - w_bi)(R_bi - R_b) > 0


Applying this method to the previous example, we get the following table: 


TABLE 5.4: Asset selection effects

Class    w_pi    w_bi    R_bi          (w_pi - w_bi)(R_bi - R_b)
1          25      15       8                 -0.25
2          40      45      10                  0.03
3          35      40      12                 -0.08
Total     100     100    R_b = 10.50          -0.30


Note that the bad performance of the portfolio w.r.t. the benchmark is
mainly due to the high weighting of asset class 1, which is rather intuitive.
Additionally, the global bad performance is the same as previously (−0.3%).
This is due to:

$$ \sum_{i=1}^{n} \left( w_{pi} - w_{bi} \right) R_b = 0 \quad \text{since} \quad \sum_{i=1}^{n} w_{pi} = \sum_{i=1}^{n} w_{bi} = 1. $$
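
The figures of Example 5.2 can be checked with a few lines of Python. The variable names are hypothetical; the data are those of the example, with returns expressed in percent.

```python
import numpy as np

w_p = np.array([0.25, 0.40, 0.35])        # portfolio weights of the three classes
w_b = np.array([0.15, 0.45, 0.40])        # benchmark weights
r_b_class = np.array([8.0, 10.0, 12.0])   # benchmark class returns (percent)

r_b = w_b @ r_b_class                            # mean benchmark return
naive = (w_p - w_b) * r_b_class                  # measure (5.70)
allocation = (w_p - w_b) * (r_b_class - r_b)     # measure (5.71)
```

The two measures differ class by class but have the same total, since the weight differences sum to zero.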


5.2.4.1.2 Asset selection effect The contribution to the global excess 
return of asset selection within each asset class is given by: 


$$ SE_i = w_{bi} \left( R_{pi} - R_{bi} \right). \tag{5.72} $$

The choice of the benchmark weight for asset class $i$ is justified in order
not to interfere with the allocation effect. The difference $(R_{pi} - R_{bi})$ is not null
as soon as the fund manager’s weighting of the assets included in the asset
class $i$ is different from the benchmark’s. Therefore, this allows us to measure
the selection effect.
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5.2.4.1.3 The interaction term Since the sum of the two previous ef- 
fects is not equal to the global excess performance of asset class 7, an additional 
term must be introduced: 


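$$ I_i = \left( w_{pi} - w_{bi} \right) \left( R_{pi} - R_{bi} \right). \tag{5.73} $$

Then we deduce:

$$ S = \sum_{i=1}^{n} \left( AA_i + SE_i + I_i \right). $$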


Example 5.3 

Consider a fund with benchmark: 35% domestic stocks; 50% domestic bonds;
and 15% international stocks. Suppose that the management period
corresponds to one year, and that the weighting of asset classes has been determined
through a strategic asset allocation which is not modified during the given
period. The numerical values are given in the following tables:


TABLE 5.5: Portfolio characteristics

                Portfolio    Benchmark    Return class i:    Return class i:
                weights      weights      portfolio          benchmark
Class           w_pi         w_bi         R_pi               R_bi
Stock (do.)        40           35          13.0               12.0
Bond (do.)         40           50           6.75               7.0
Stock (int.)       20           15          11.0               11.0
Total             100          100                           R_b = 9.35


The measures of the three effects are given by: 


TABLE 5.6: Performance attribution

                 Measure of     Measure of    Measure of
                 allocation     selection     interaction
                 effect         effect        effect
Stocks (do.)        0.13           0.35          0.05
Bonds (do.)         0.24          -0.13          0.03
Stocks (int.)       0.08           0.00          0.00
Total               0.45           0.23          0.075


The benchmark and the portfolio have returns respectively equal to 9.35% 
and 10.10%. This excess performance is decomposed into 0.45% for the al- 
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location effect, 0.23% for the asset selection and 0.075% for the interaction 
effect. 
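
The three effects of Example 5.3 can be recomputed with the following Python sketch. The function name is hypothetical; the formulas are (5.71), (5.72) and (5.73), and the data are those of Table 5.5 (returns in percent).

```python
import numpy as np

def brinson_attribution(w_p, w_b, r_p, r_b):
    """Allocation, selection and interaction effects for each asset class."""
    w_p, w_b, r_p, r_b = (np.asarray(x, dtype=float) for x in (w_p, w_b, r_p, r_b))
    bench = w_b @ r_b                              # mean benchmark return R_b
    allocation = (w_p - w_b) * (r_b - bench)       # AA_i, eq. (5.71)
    selection = w_b * (r_p - r_b)                  # SE_i, eq. (5.72)
    interaction = (w_p - w_b) * (r_p - r_b)        # I_i,  eq. (5.73)
    return allocation, selection, interaction

aa, se, it = brinson_attribution(
    w_p=[0.40, 0.40, 0.20], w_b=[0.35, 0.50, 0.15],
    r_p=[13.0, 6.75, 11.0], r_b=[12.0, 7.0, 11.0],
)
excess = aa.sum() + se.sum() + it.sum()   # equals R_p - R_b
```

Before rounding, the class-level effects are 0.1325, 0.235 and 0.0825 for allocation, and 0.35, −0.125 and 0 for selection, which matches Table 5.6.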


5.2.4.2 Limit of attribution method 


The Brinson method does not sufficiently take into account the risk of
financial investments. Consider, for instance, a fund manager who chooses a
portfolio according to the Markowitz criterion applied on the tracking error
(see Chapter 4 and Roll [431]).


Suppose that the benchmark contains three asset classes, each with two 
assets. Suppose that this benchmark is not efficient. 


The minimization of the tracking error is illustrated by the following figure, 
where the benchmark (resp. the managed portfolio 1) has a mean-return equal 
to 8.60% (resp. 9.60%) and a standard deviation 9.30% (resp. 10.57%). The 
tracking error volatility is equal to 2.24%. 
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FIGURE 5.8: Efficient and relative frontiers 


Despite the optimal choice of portfolio 1, the performance attribution is not 
necessarily good. The following table indicates the performance attribution 
process. 
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TABLE 5.7: Performance attribution of portfolio 1 


Asset      Portfolio    Benchmark    Allocation    Selection    Interaction
class      weighting    weighting    effect        effect       effect
1            65.80        40.00        0.232        -0.027        -0.017
2            14.73        40.00        0.278         0.012        -0.007
3            19.47        20.00       -0.002         0.546        -0.014
Total       100.00       100.00        0.508         0.531        -0.039





The excess portfolio return w.r.t. the benchmark is equal to 1.00%: 0.5081%
is due to the allocation process, 0.5308% is due to the asset selection and
−0.0389% is due to the interaction effect.

Although the portfolio is optimal, some of the performance indicators are
negative. For example, the low weighting of asset class 3 leads to a negative
contribution equal to −0.002% of the global portfolio performance. However,
these results do not contradict the portfolio optimality, since performance
attribution does not take account of the tracking error itself.


5.2.4.2.1 The risk attribution To keep the coherence with the bench- 
mark optimization, Bertrand [56] proposes a performance attribution method 
based on the decomposition of the tracking-error volatility. 


In particular, we can note that information ratios for each decision process 
are equal (asset allocation, asset selection, and interaction effect). 


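The tracking-error volatility is given by:

$$ T = \frac{T^2}{T} = \frac{(z_p - z_b)' V (z_p - z_b)}{\sqrt{(z_p - z_b)' V (z_p - z_b)}} = \frac{Cov(S, S)}{T} $$

$$ = \frac{1}{T}\, Cov\left( \sum_{l=1}^{m} (z_{pl} - z_{bl}) Z_l,\; \sum_{l=1}^{m} (z_{pl} - z_{bl}) Z_l \right) $$

$$ = \frac{1}{T} \left[ Cov\left( \sum_{i=1}^{n} (w_{pi} - w_{bi})(Z_{bi} - Z_b),\, S \right) + Cov\left( \sum_{i=1}^{n} w_{bi} (Z_{pi} - Z_{bi}),\, S \right) + Cov\left( \sum_{i=1}^{n} (w_{pi} - w_{bi})(Z_{pi} - Z_{bi}),\, S \right) \right] $$

$$ = \frac{1}{T} \left[ Cov(AA, S) + Cov(SE, S) + Cov(I, S) \right]. $$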
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From the previous relation, the tracking-error volatility T is decomposed 
into three terms: 


• The contribution to the relative risk of the asset allocation, which is
measured by $Cov(AA, S)/T$;

• The contribution to the relative risk of the asset selection, which is
measured by $Cov(SE, S)/T$;

• The contribution to the relative risk of the interaction effect, which is
measured by $Cov(I, S)/T$.


PROPOSITION 5.5
According to Bertrand [56], for any portfolio belonging to the relative frontier,
we have:

• Each term in the risk attribution decomposition has the same sign as the
corresponding term in the performance attribution decomposition.
Additionally, it is equal to the component of performance attribution divided
by the information ratio:

$$ \frac{Cov(AA_i, S)}{T} = \frac{(w_{pi} - w_{bi})(R_{bi} - R_b)}{RI_P}, $$

$$ \frac{Cov(SE_i, S)}{T} = \frac{w_{bi} (R_{pi} - R_{bi})}{RI_P}, $$

$$ \frac{Cov(I_i, S)}{T} = \frac{(w_{pi} - w_{bi})(R_{pi} - R_{bi})}{RI_P}. $$

• Each term in the risk attribution decomposition has the same
information ratio as the information ratio of the portfolio $RI_P$:

$$ RI(AA_i) = \frac{(w_{pi} - w_{bi})(R_{bi} - R_b)}{Cov(AA_i, S)/T} = RI_P, $$

$$ RI(SE_i) = \frac{w_{bi} (R_{pi} - R_{bi})}{Cov(SE_i, S)/T} = RI_P, $$

$$ RI(I_i) = \frac{(w_{pi} - w_{bi})(R_{pi} - R_{bi})}{Cov(I_i, S)/T} = RI_P. $$


The previous results show that for portfolios belonging to the relative fron- 
tier, the appropriate risk attribution measure is the tracking-error volatility 
of each term of the decomposition. 
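
The additive decomposition of the tracking-error volatility can be verified numerically. In the Python sketch below, each of the three effects is written as a linear combination of the asset returns $Z_l$, so that its covariance with $S$ follows from the matrix $V$. The function name, the argument layout and the test data are hypothetical; the sketch checks only that the three contributions sum to $T$, which holds for any portfolio, and not the proportionality stated in the proposition, which requires a portfolio on the relative frontier.

```python
import numpy as np

def risk_attribution(z_p, z_b, cls, V):
    """Tracking-error volatility T and the contributions Cov(., S) / T.

    z_p, z_b: asset weights of the portfolio and the benchmark (length m).
    cls: class index (0, ..., n-1) of each asset; every class must have a
         nonzero weight in both the portfolio and the benchmark.
    V: covariance matrix of the asset returns (m x m).
    """
    z_p = np.asarray(z_p, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    cls = np.asarray(cls)
    V = np.asarray(V, dtype=float)
    n = int(cls.max()) + 1
    w_p = np.bincount(cls, weights=z_p, minlength=n)   # class weights w_pi
    w_b = np.bincount(cls, weights=z_b, minlength=n)   # class weights w_bi
    member = np.eye(n)[cls].T            # member[i, l] = 1 if asset l is in class i
    c_p = member * (z_p / w_p[cls])      # row i: coefficients of Z_pi on the Z_l
    c_b = member * (z_b / w_b[cls])      # row i: coefficients of Z_bi on the Z_l
    d = z_p - z_b                        # S = d' Z
    effects = {
        "allocation": (w_p - w_b) @ (c_b - z_b),    # sum_i (w_pi - w_bi)(Z_bi - Z_b)
        "selection": w_b @ (c_p - c_b),             # sum_i w_bi (Z_pi - Z_bi)
        "interaction": (w_p - w_b) @ (c_p - c_b),   # sum_i (w_pi - w_bi)(Z_pi - Z_bi)
    }
    te = np.sqrt(d @ V @ d)              # assumes the two portfolios differ (te > 0)
    return te, {name: (a @ V @ d) / te for name, a in effects.items()}
```

A negative contribution identifies a decision that reduces the relative risk, which is the interpretation used in the discussion of Table 5.8.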
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The following table indicates the tracking-error volatility of each component 
of the performance attribution process proposed in the previous example. 


TABLE 5.8: Tracking-error volatilities

                 Cov(AA_i, S)    Cov(SE_i, S)    Cov(I_i, S)    Total
Asset class 1       0.5205         -0.0596         -0.0385      0.422
Asset class 2       0.6231          0.0258         -0.0163      0.633
Asset class 3      -0.0047          1.2234         -0.0323      1.186
Total               1.1388          1.1896         -0.0871      2.241


The total tracking-error volatility of the portfolio is equal to 2.241%. Note 
that each decision which leads to a bad performance (for example, the low 
weighting of asset class 3 or the weighting into asset class 1) is now clearly 
identified as a decision which contributes to the reduction of the relative risk 
and thus is justified. 


The constancy of the information ratio is illustrated by the following table:

TABLE 5.9: Information ratios

                 RI(AA_i)    RI(SE_i)    RI(I_i)
Asset class 1    0.44617     0.44617     0.44617
Asset class 2    0.44617     0.44617     0.44617
Asset class 3    0.44617     0.44617     0.44617


Note that Grinold and Kahn [273] consider that such a value for an infor- 
mation ratio (=0.44617) is a good indicator for the active fund manager. 
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5.3 Further Reading 


General results about portfolio performance are presented in Grinold and 
Kahn [273], and Amenc and Le Sourd [22]. 


Performance measures taking management style into account are intro- 
duced in Sharpe [465], and in Lobosco [360], who proposes the SRAP measure 
(style/risk-adjusted performance). Muralidhar [396] introduces a specific mea- 
sure to compare performance of different managers within funds with the same 
objectives (so belonging to the same peer group). International diversifica- 
tion can be taken into account by using the IAPM (international asset pricing 
model) introduced by Solnik [472]. 


The problem of performance persistence is related to market efficiency. 
However, from the professional point of view, investment performances for 
individual fund managers are examined (skillful or lucky?) rather than the 
global market inefficiency. As mentioned in Kahn and Rudd [313], the earli- 
est empirical studies seem to suggest no performance persistence while recent 
articles conclude that a certain level of performance persistence exists. 


According to Brown et al. [93], short-term performance is persistent, but
survivorship bias affects the results: bad funds tend to disappear, which
biases the sample in favor of performance persistence (see also [93]).
Jegadeesh and Titman [299] examine NYSE and AMEX securities over the period
1965-1989. They show that a momentum strategy, which is based on buying the
best-performing securities and selling the worst ones from the previous six
months, provides a 1% per month excess return over the following six months.


The difference in results concerning the performance persistence can be due 
to seasonal or daily effects. Note also that stock markets are subject to cycles. 
Thus, a given management process may be good for a given cycle and bad for 
another. 


Other recent methods involving new risk measures have been introduced to 
study risk attribution and portfolio performance. The total risk of a portfolio 
can be decomposed into terms that can be interpreted as the risk contribution 
of the corresponding subsets of the portfolio. An overview of such methodol- 
ogy is provided in Rachev and Zhang [422]. 


Part III 


Dynamic portfolio 
optimization 


“Finance is a highly analytical subject, and nowhere more so than in 
continuous-time analysis. Indeed, the mathematics of the continuous-time 
finance model contains some of the most beautiful applications of probability 
and optimization theory. But, of course, not all that is beautiful in science 
need also be practical. And surely, not all that is practical in science is beau- 
tiful. Here we have both. With all its seemingly abstruse mathematics, the 
continuous-time model has nevertheless found its way into the mainstream of 
finance practice. Perhaps its most visible influence on practice has been in the 
pricing and hedging of financial instruments, an area that has experienced an 
explosion of real-world innovations over the last decade. In fact, much of the 
applied research on using the continuous-time model in this area now takes 
place within practicing financial institutions.” 


Robert Merton, “Continuous-Time Finance,” Blackwell Publishers, (1990). 
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Portfolio optimization is said to be “myopic” when the investor does not 
know what will happen beyond the immediate next period. In this framework, 
basic results about one-period portfolio optimization, such as mean-variance 
analysis, were described in Part II. Such an approach can be justified for 
short-term horizons without portfolio rebalancing, or for special utility func- 
tions such as power utilities (portfolio choice is myopic when the relative risk 
aversion is constant and returns are iid). 


However, for long-term investment, the investor can benefit from dynamic 
portfolio optimization, which allows him to take account of important oppor- 
tunities: 


• The investor can modify the portfolio weighting along the whole
management period, contrary to a “buy and hold” strategy. Therefore, for
example, the portfolio value at maturity can be a quite general function
of financial indexes (no longer necessarily linear).


• The investor can use the information delivered at any time by observing
financial or economic indicators. In particular, the portfolio strategy
can take into account variations of main financial indices.


Despite the unrealistic assumption that a portfolio is actually rebalanced
in continuous-time, this approximation can be justified for several reasons:


• First, looking at rebalancing times during a time period [0, T] (for
example, intraday market), we observe that they do not correspond to the
same deterministic moments, but look like a marked point process with
values in the whole time interval [0, T].


• Second, when examining financial market properties, we can fix a time
scaling so that derivatives hedging and pricing seem to be in
continuous-time, and perhaps choose another time scaling in order to assume
that portfolio strategies are in continuous-time.


• Finally, under some mild assumptions, discrete-time financial models
converge to continuous-time ones, in particular, optimal portfolio
strategies (see, e.g., Prigent [414]).

Indeed, continuous-time modelling leads to more complexity than one-period
modelling, requiring the introduction of stochastic processes, Ito's lemma,
martingales, etc. However, for standard models, dynamic completeness leads
to significant simplification. For utility maximization, explicit solutions
can be deduced and analyzed, while, for a one-period incomplete financial
market, this analysis may be more involved.
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This part is devoted to continuous-time optimization: 
• Chapter 6 provides a “brief” summary of standard dynamic optimization.
The two main approaches are illustrated by basic examples:


- The first one is based on the dynamic programming method, using the
Pontryagin and Bellman principles.

- The second one is based on martingale methods and duality.


• Chapter 7 is devoted to the search for optimal payoff profiles and
long-term portfolio management:


- Search of an optimal portfolio value, which is assumed to be a function 
of a given security price, for example a financial index. 


- Determination of a long-term portfolio, which is composed of three 
main assets: cash, bond, and stock. 


• Finally, Chapter 8 describes the main results of financial portfolio
optimization when “frictions” are considered:


- Market incompleteness and/or convex constraints; 
- Transaction costs; and, 


- Other extensions, such as the existence of a labor income or a random 
time horizon. 


Chapter 6 


Dynamic programming optimization 


6.1 Control theory 


In what follows, we consider the calculus of variations in the deterministic
framework to introduce the main ideas of dynamic programming.


6.1.1 Calculus of variations 
6.1.1.1 General problem of the calculus of variations 


Very often, this problem takes the following form: 


- Let $[0, T]$ be the time period.

- Let $U$ be a function w.r.t. three variables $(t, x, y)$ with real values, assumed
to be continuously differentiable on $[0, T] \times \mathbb{R}^d \times \mathbb{R}^d$.

- We search for a function $x(.)$, continuously differentiable on $[0, T]$ with
values in $\mathbb{R}^d$, solution of the optimization problem $(P)$:


\[
\max_{x(.)} \; \mathcal{U}(x(.)) = \int_0^T U(t, x(t), \dot{x}(t))\,dt,
\quad \text{under: } x(0) = x_0 \text{ and } x(T) = x_T, \tag{6.1}
\]


where $\dot{x}(t)$ denotes the derivative of $x(.)$ w.r.t. the current time, and $x_0$ and
$x_T$ are given.
In what follows, the general results are illustrated by some basic examples. 


6.1.1.2 Standard consumption-saving problems 


Example 6.1 
The function $x(.)$ denotes the wealth invested in a riskless asset with rate of
return $r$. Suppose that there exists an income $i(.)$ which is divided into savings
$s(.)$ and consumption $c(.)$: $i(t) = s(t) + c(t)$. Then, the wealth dynamics are
given by:

\[
\dot{x}(t) = r x(t) + s(t). \tag{6.2}
\]
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Assume that, at a given initial date, the wealth x(0) is equal to zo, and that 
the goal at horizon $T$ is to hold $x_T$. Suppose that $U(t, x, y) = e^{-\rho t} u(c)$, where
$\rho$ denotes a psychological discount factor, and $u(.)$ the instantaneous utility
(i.e., on consumption $c$). Then, the optimization problem is:


\[
\max \int_0^T e^{-\rho t} u(c(t))\,dt,
\quad \text{under: } x(0) = x_0 \text{ and } x(T) = x_T. \tag{6.3}
\]


Since $c(t) = i(t) + r x(t) - \dot{x}(t)$, then $U(t, x, y) = e^{-\rho t} u(i(t) + r x - y)$.


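6.1.1.2.1 Euler equation We search for a solution $x(.)$ of the previous
problem $(P)$ (with the same assumptions).


PROPOSITION 6.1
(Euler equation)
If $x^*(.)$ is the solution of $(P)$, then it satisfies the equation:


\[
\frac{d}{dt}\left(\frac{\partial U}{\partial y}\big(t, x^*(t), \dot{x}^*(t)\big)\right)
= \frac{\partial U}{\partial x}\big(t, x^*(t), \dot{x}^*(t)\big). \tag{6.4}
\]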


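PROOF Let $x^*(.)$ be the solution of $(P)$, and let $x(.)$ be another function
such that $x(0) = x(T) = 0$. Then, for any $s > 0$, the function $t \mapsto
x^*(t) + s x(t)$ satisfies the constraints of $(P)$. Therefore, we have:


\[
\frac{1}{s}\left[\mathcal{U}(x^* + s x) - \mathcal{U}(x^*)\right] \leq 0.
\]


Taking the limit when $s \to 0^+$, we deduce:


\[
\int_0^T \left[\alpha(t) x(t) + \beta(t) \dot{x}(t)\right] dt \leq 0,
\]


with


\[
\alpha(t) = \frac{\partial U}{\partial x}\big(t, x^*(t), \dot{x}^*(t)\big)
\quad \text{and} \quad
\beta(t) = \frac{\partial U}{\partial y}\big(t, x^*(t), \dot{x}^*(t)\big).
\]


Integrating by parts and using $x(0) = x(T) = 0$, we obtain, for any scalar $c$:


\[
\int_0^T \left[\beta(t) - \left(c + \int_0^t \alpha(u)\,du\right)\right] \dot{x}(t)\,dt \leq 0.
\]


Now, consider the particular function
$x(t) = \int_0^t \left[\beta(v) - \left(c + \int_0^v \alpha(u)\,du\right)\right] dv$, with
$c$ such that $\int_0^T \left[\beta(v) - \left(c + \int_0^v \alpha(u)\,du\right)\right] dv = 0$,
so that $x(0) = x(T) = 0$.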
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Then, we have: 
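\[
0 \leq \int_0^T \left[\beta(t) - \left(c + \int_0^t \alpha(u)\,du\right)\right]^2 dt
= \int_0^T \left[\dot{x}(t)\right]^2 dt \leq 0.
\]


Thus, necessarily $\dot{x}(t) = 0$, that is $\beta(t) = c + \int_0^t \alpha(u)\,du$. Differentiating
w.r.t. $t$ gives $\dot{\beta}(t) - \alpha(t) = 0$, which is the Euler equation.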


6.1.1.2.2 Application to the consumption-saving plan (see Demange 
and Rochet [156]) 


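Since $U(t, x, y) = e^{-\rho t} u[i(t) + r x - y]$ and $c^*(t) = i(t) + r x^*(t) - \dot{x}^*(t)$, we
get:


\[
\frac{\partial U}{\partial x}\big(t, x^*(t), \dot{x}^*(t)\big) = r e^{-\rho t} u'[c^*(t)],
\]


\[
\frac{\partial U}{\partial y}\big(t, x^*(t), \dot{x}^*(t)\big) = -e^{-\rho t} u'[c^*(t)].
\]


Thus, the Euler equation is:


\[
\frac{d}{dt}\left(-e^{-\rho t} u'[c^*(t)]\right) = r e^{-\rho t} u'[c^*(t)].
\]

Then:


\[
-\rho\, u'[c^*(t)] + \frac{d}{dt}\left(u'[c^*(t)]\right) = -r\, u'[c^*(t)]
\iff
\frac{\frac{d}{dt}\left(u'[c^*(t)]\right)}{u'[c^*(t)]} = \rho - r
\iff
u'[c^*(t)] = k \exp\left[(\rho - r) t\right],
\]


where $k$ is a non-negative constant. Assuming that $u'$ has an inverse function
$j(.)$, we have:


\[
c^*(t) = j\left[k \exp\left[(\rho - r) t\right]\right]. \tag{6.5}
\]


Therefore, the optimal consumption is a monotonic function of current
time. Since $u'$ (and hence $j$) is decreasing for a concave utility, it is
increasing if and only if the interest rate $r$ is higher than the
psychological discount rate $\rho$.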
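As an illustration of (6.5), the following sketch evaluates the optimal consumption path for a power (CRRA) utility. The utility family is an assumption made for the example, not taken from the text: with $u'(c) = c^{-\gamma}$ the inverse marginal utility is $j(y) = y^{-1/\gamma}$, and $\gamma = 1$ corresponds to $u(c) = \ln c$. The function name and parameter values are hypothetical.

```python
import math


def optimal_consumption(t, k, rho, r, gamma=1.0):
    """c*(t) = j(k * exp((rho - r) * t)) for an assumed CRRA utility.

    u'(c) = c**(-gamma), hence j(y) = y**(-1/gamma); gamma = 1 is log utility.
    k is the positive constant appearing in (6.5).
    """
    marginal_utility = k * math.exp((rho - r) * t)
    return marginal_utility ** (-1.0 / gamma)


# Consumption rises over time when r > rho and falls when r < rho.
patient = [optimal_consumption(t, k=1.0, rho=0.02, r=0.05) for t in (0, 5, 10)]
impatient = [optimal_consumption(t, k=1.0, rho=0.05, r=0.02) for t in (0, 5, 10)]
print(patient)
print(impatient)
```

The constant `k` is left as an input here; it is pinned down by the boundary conditions on wealth.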

When the solutions are not necessarily continuously differentiable but “suf- 
ficiently” regular, a modified version of the Euler equation can be deduced. 
Examine solutions which are continuously differentiable, except at a count- 
able number of points where they have left-hand and right-hand derivatives. 
Denote this set by E. 
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PROPOSITION 6.2 
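(Piece-wise differentiable solutions) If $x^*(.)$ belongs to $E$, then we have:


\[
\frac{\partial U}{\partial y}\big(t, x^*(t), \dot{x}^*(t)\big)
= \int_0^t \frac{\partial U}{\partial x}\big(s, x^*(s), \dot{x}^*(s)\big)\,ds + \text{constant}, \tag{6.6}
\]


with:
- If $x^*(.)$ is differentiable at $t$, we recover the Euler equation.


- If $x^*(.)$ is not differentiable at $t$, $\frac{\partial U}{\partial y}\big(t, x^*(t), \dot{x}^*(t)\big)$ is continuous
(Erdmann-Weierstrass condition).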


If the terminal value $x_T$ is relaxed, the optimal solution $x^*(.)$ obviously
corresponds to a solution of the previous problem $(P)$ with $x_T = x^*(T)$.
Therefore, this solution must satisfy the Euler and Erdmann-Weierstrass conditions.
Nevertheless, since this terminal value is not fixed, it must be compared with
all solutions of $(P)$ when $x_T$ is varying. We can also introduce an additional
function $A(.)$ defined on the terminal value and continuously differentiable.
This leads to the following condition:


PROPOSITION 6.3 
(Transversality condition) 
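If $x^{**}(.)$ is solution of the following problem $(P')$:


\[
\max_{x(.)} \; \mathcal{U}(x(.)) = \int_0^T U(t, x(t), \dot{x}(t))\,dt + A[x(T)],
\quad \text{under: } x(0) = x_0, \tag{6.7}
\]

then

\[
\frac{\partial U}{\partial y}\big(t, x^{**}(t), \dot{x}^{**}(t)\big)
= -A'\left[x^{**}(T)\right] - \int_t^T \frac{\partial U}{\partial x}\big(s, x^{**}(s), \dot{x}^{**}(s)\big)\,ds, \tag{6.8}
\]

with


- If $x^{**}(.)$ is differentiable at $t$, we recover the Euler equation:


\[
\frac{d}{dt}\left(\frac{\partial U}{\partial y}\big(t, x^{**}(t), \dot{x}^{**}(t)\big)\right)
= \frac{\partial U}{\partial x}\big(t, x^{**}(t), \dot{x}^{**}(t)\big). \tag{6.9}
\]


- If $x^{**}(.)$ is not differentiable at $t$, $\frac{\partial U}{\partial y}\big(t, x^{**}(t), \dot{x}^{**}(t)\big)$ is continuous
(Erdmann-Weierstrass condition).

- At terminal value:


\[
\frac{\partial U}{\partial y}\big(T, x^{**}(T), \dot{x}^{**}(T)\big) + A'\left[x^{**}(T)\right] = 0.
\]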
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PROOF The proof is similar to the previous one: let x**(.) be a solution 
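of problem $(P')$, and $x(.)$ be a function (continuously differentiable except at
a countable number of points) such that $x(0) = 0$. Then, for any $s > 0$, we
have:


\[
\frac{1}{s}\left[\mathcal{U}(x^{**} + s x) - \mathcal{U}(x^{**})\right] \leq 0.
\]


Taking the limit when $s \to 0^+$, we deduce:


\[
\int_0^T \left[\alpha(t) x(t) + \beta(t) \dot{x}(t)\right] dt + \gamma\, x(T) \leq 0,
\]


with


\[
\alpha(t) = \frac{\partial U}{\partial x}\big(t, x^{**}(t), \dot{x}^{**}(t)\big), \quad
\beta(t) = \frac{\partial U}{\partial y}\big(t, x^{**}(t), \dot{x}^{**}(t)\big), \quad
\gamma = A'\left(x^{**}(T)\right).
\]


Integrating by parts, and using $x(0) = 0$, we have:


\[
\int_0^T \left[\beta(t) + \gamma + \int_t^T \alpha(u)\,du\right] \dot{x}(t)\,dt \leq 0.
\]


Now, consider the particular function
$x(t) = \int_0^t \left[\beta(v) + \gamma + \int_v^T \alpha(u)\,du\right] dv$. We have:

\[
\int_0^T \left[\dot{x}(t)\right]^2 dt \leq 0.
\]


Thus, necessarily $\dot{x}(t) = 0$, that is $\beta(t) = -\gamma - \int_t^T \alpha(u)\,du$, which proves
the result (in particular, $\dot{\beta}(t) = \alpha(t)$ and $\beta(T) + \gamma = 0$).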


REMARK 6.1 If the time horizon is no longer fixed, then the transversality
condition is modified (see Seierstad and Sydsaeter [459]). If the horizon
$T$ is infinite, the problem is more involved, since the transversality condition
“at infinity” may not be necessary.


Example 6.2 Consumption-saving problem without fixed wealth at 
maturity 


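(see [156])

\[
\max_{x(.)} \; \mathcal{U}(x(.)) = \int_0^T e^{-\rho t} u\big(i(t) + r x(t) - \dot{x}(t)\big)\,dt + A[x(T)],
\quad \text{under: } x(0) = x_0. \tag{6.10}
\]


The Euler equation is valid at every point where the control $\dot{x}(.)$ is
continuous. Then:

\[
u'\left(c^{**}(t)\right) = l \exp\left[(\rho - r) t\right].
\]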
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The solution is similar to the previous one when the terminal value is fixed. 
However, the two constants $k$ and $l$ differ.


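Suppose that $u(c) = \ln c$ and $A(x) = \exp(-\rho T) \ln x$. Then,

\[
c^*(t) = \frac{1}{k} \exp\left[(r - \rho) t\right]
\quad \text{and} \quad
c^{**}(t) = \frac{1}{l} \exp\left[(r - \rho) t\right].
\]

For both cases $c(t) = c^*(t)$ and $c(t) = c^{**}(t)$, the wealth $x(.)$ is given by:


\[
x(t) = x_0 e^{rt} + \int_0^t \left[i(s) - c(s)\right] e^{r(t-s)}\,ds
= x_0 e^{rt} + \int_0^t i(s)\, e^{r(t-s)}\,ds - \frac{1}{h\rho}\, e^{rt} \left(1 - e^{-\rho t}\right),
\]


where $h = k$ or $l$, according to the optimization problem.
Denote by $V_T$ the discounted total wealth:


\[
V_T = x_0 + \int_0^T i(s)\, e^{-rs}\,ds.
\]

- For $(P)$, we get:

\[
x_T e^{-rT} = V_T - \frac{1}{k\rho} \left(1 - e^{-\rho T}\right).
\]

- For $(P')$, the transversality condition

\[
A'\left(x^{**}(T)\right) + \frac{\partial U}{\partial y}\big(T, x^{**}(T), \dot{x}^{**}(T)\big) = 0
\]

gives $x^{**}(T) = \frac{1}{l} e^{(r - \rho) T}$. Thus,

\[
\frac{1}{l}\, e^{-\rho T} = V_T - \frac{1}{l\rho} \left(1 - e^{-\rho T}\right).
\]

Finally, the first problem has a rational solution ($k > 0$) if the discounted
total wealth $V_T$ is higher than the discounted wealth $x_T e^{-rT}$ fixed at time $T$.
Then,


\[
\frac{1}{k} = \frac{\left(V_T - x_T e^{-rT}\right) \rho}{1 - e^{-\rho T}}.
\]


For the second problem, we deduce:


\[
\frac{1}{l} = \frac{V_T\, \rho}{(\rho - 1)\, e^{-\rho T} + 1}.
\]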
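The closed forms for $1/k$ and $1/l$ in the logarithmic case can be checked numerically. The sketch below assumes a constant income rate (a simplification chosen for the example) and integrates the wealth dynamics $\dot{x} = r x + i - c$ with an explicit Euler scheme. All function names and parameter values are illustrative.

```python
import math


def discounted_total_wealth(x0, income, r, T):
    """V_T = x0 + integral_0^T income * exp(-r s) ds, for a constant income rate."""
    return x0 + income * (1.0 - math.exp(-r * T)) / r


def inv_k(V_T, x_T, rho, r, T):
    """1/k for the problem with imposed terminal wealth x_T."""
    return (V_T - x_T * math.exp(-r * T)) * rho / (1.0 - math.exp(-rho * T))


def inv_l(V_T, rho, T):
    """1/l for the free terminal wealth problem with A(x) = exp(-rho T) ln x."""
    return V_T * rho / ((rho - 1.0) * math.exp(-rho * T) + 1.0)


def terminal_wealth(x0, income, inv_h, rho, r, T, steps=20000):
    """Explicit Euler integration of x' = r x + i - c, c(t) = inv_h * exp((r - rho) t)."""
    dt = T / steps
    x = x0
    for n in range(steps):
        c = inv_h * math.exp((r - rho) * n * dt)
        x += (r * x + income - c) * dt
    return x


x0, income, r, rho, T, x_T = 100.0, 10.0, 0.03, 0.05, 10.0, 50.0
V_T = discounted_total_wealth(x0, income, r, T)
one_over_k = inv_k(V_T, x_T, rho, r, T)
one_over_l = inv_l(V_T, rho, T)
# Fixed terminal wealth: the simulated x(T) should be close to x_T.
print(terminal_wealth(x0, income, one_over_k, rho, r, T))
# Free terminal wealth: the simulated x(T) should be close to (1/l) exp((r - rho) T).
print(terminal_wealth(x0, income, one_over_l, rho, r, T), one_over_l * math.exp((r - rho) * T))
```

With these illustrative parameters the planner who is free to choose terminal wealth consumes more (a larger $1/l$ than $1/k$) and leaves less wealth at time $T$.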
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6.1.2 Pontryagin and Bellman principles 


Assume now that the time derivative $\dot{x}(.)$ is no longer a true control variable,
but must satisfy an ordinary differential equation (ODE): for $t$ in $[0, T]$,


\[
\dot{x}(t) = f(t, x(t), v(t)), \tag{6.11}
\]


where the variable x(.) is the state variable and v(.) is the control variable. 
Then, the usual optimization problem is: 

- Let $[0, T]$ be the time period.

- Let $U$ be a function w.r.t. three variables $(t, x, v)$ with real values, assumed
to be continuously differentiable on $[0, T] \times \mathbb{R}^d \times \mathbb{R}^{d'}$.

- Let $f$ be a function w.r.t. three variables $(t, x, v)$ with values in $\mathbb{R}^d$,
assumed to be continuously differentiable on $[0, T] \times \mathbb{R}^d \times \mathbb{R}^{d'}$.

- Let $A$ be a function w.r.t. $x$, with values in $\mathbb{R}$, assumed also to be
continuously differentiable on $\mathbb{R}^d$.

We search for a state function $x(.)$, continuously differentiable on $[0, T]$,
and a control function $v(.)$, continuous on $[0, T]$, solution of the optimization
problem $(P'')$:


\[
\max \; \mathcal{U}(x(.)) = \int_0^T U(t, x(t), v(t))\,dt + A[x(T)], \tag{6.12}
\]

\[
\text{under: } \dot{x}(t) = f(t, x(t), v(t)), \quad x(0) = x_0, \quad \text{and } v(t) \in V,
\]


where $V$ is the set of constraints on the control variable, supposed to be an
open convex subset of $\mathbb{R}^{d'}$.


When $f(t, x, v) = v$, the optimization problem is the previous calculus of
variations. Otherwise, another method must be introduced. Two “independent”
approaches have been proposed:


• The Pontryagin method considers that Problem $(P'')$ is a calculus of
variations with an infinite number of constraints corresponding to the
state equation (6.11).


• The Bellman approach uses the dynamic programming principle.

However, both use the notion of Hamiltonian of $(P'')$:

DEFINITION 6.1 The Hamiltonian of $(P'')$ is defined by: for any $(t, x, p, v)$
in $[0, T] \times \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^{d'}$,

\[
H(t, x, p, v) = U(t, x, v) + \langle p, f(t, x, v) \rangle, \tag{6.13}
\]

where $\langle ., . \rangle$ denotes the scalar product on $\mathbb{R}^d$.
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6.1.2.1 The Pontryagin principle 


This can be deduced from the Lagrange theorem and the Euler equation,
as shown in what follows. For this purpose, define the Lagrange multiplier
$p(t)$ associated to the constraint (6.11) at current time $t$. The Lagrangian $\mathcal{L}$
of $(P'')$ is:

\[
\mathcal{L} = \int_0^T \left[U(t, x(t), v(t)) + \langle p(t), f(t, x(t), v(t)) \rangle - \langle p(t), \dot{x}(t) \rangle\right] dt + A(x(T)), \tag{6.14}
\]

which is equivalent to:

\[
\mathcal{L} = \int_0^T \left[H(t, x(t), p(t), v(t)) - \langle p(t), \dot{x}(t) \rangle\right] dt + A(x(T)). \tag{6.15}
\]

Therefore, the maximization w.r.t. $(x, \dot{x}, v)$ can be decomposed into two
steps:


• Step 1: Maximization w.r.t. $v$.


Any optimal control must satisfy the following condition:


\[
v^*(t) = \arg\max_{v} H(t, x(t), p(t), v), \quad \text{for almost every } t.
\]


Denote by $H^*(t, x(t), p(t))$ the maximum of the Hamiltonian.


• Step 2: Maximization w.r.t. $(x, \dot{x})$.


We have to solve the following problem of calculus of variations: for
$x(0) = x_0$,


\[
\max \int_0^T \left[H^*(t, x(t), p(t)) - \langle p(t), \dot{x}(t) \rangle\right] dt + A(x(T)).
\]

Since the terminal value $x_T$ is not fixed, the optimality conditions are
given by Proposition 6.3:


- If $v(.)$ is continuous at $t$, we recover the Euler equation:


\[
\dot{p}(t) = -\frac{\partial H^*}{\partial x}\big(t, x(t), p(t)\big). \tag{6.16}
\]


- If $v(.)$ is discontinuous at $t$, $t \mapsto p(t)$ is continuous (Erdmann-Weierstrass
condition).


- At terminal value (transversality condition):


\[
p(T) = A'(x(T)).
\]


Therefore, the pair $(x(.), p(.))$ satisfies:


\[
\dot{x}(t) = \frac{\partial H^*}{\partial p}\big(t, x(t), p(t)\big), \tag{6.17}
\]

\[
\dot{p}(t) = -\frac{\partial H^*}{\partial x}\big(t, x(t), p(t)\big), \tag{6.18}
\]


with $p(T) = A'(x(T))$ and $x(0) = x_0$.


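Note that:
- When $H^*$ does not depend on current time, the Hamiltonian is constant
along the solutions of (6.17)-(6.18), since:

\[
\frac{d}{dt} H^*(x(t), p(t)) = \frac{\partial H^*}{\partial x}\,\dot{x} + \frac{\partial H^*}{\partial p}\,\dot{p} = 0.
\]

- The boundary condition implies that both the initial value of $x(.)$ and
the terminal value of $p(.)$ are fixed.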


THEOREM 6.1 Pontryagin principle 


Assume that:

- The utility function $U$, the function $f$, and the function $A$ are continuously
differentiable on their respective domains.

- In addition, the function $f$ is bounded and Lipschitzian w.r.t. $x$, uniformly
w.r.t. $(t, v)$.

- The set $V$ is a convex and compact subset of $\mathbb{R}^{d'}$.

Then, if $(x^*(.), v^*(.))$ is solution of $(P'')$, there exists $p^*(.)$, belonging to $E$,
such that:

1) $v^*(t) = \arg\max_v H(t, x^*(t), p^*(t), v)$, for almost every $t$. Denote this
maximum by $H^*(t, x^*(t), p^*(t))$.

2) The pair $(x^*, p^*)$ is the solution of the Hamilton-Jacobi equation:


\[
\begin{cases}
\dot{p}(t) = -\dfrac{\partial H^*}{\partial x}\big(t, x(t), p(t)\big), & p(T) = A'(x(T)), \\[2mm]
\dot{x}(t) = \dfrac{\partial H^*}{\partial p}\big(t, x(t), p(t)\big), & x(0) = x_0.
\end{cases} \tag{6.19}
\]


REMARK 6.2 This theorem is applied through two steps:

1) Hamiltonian maximization: we search for $v^*(t, x(t), p(t))$, which
determines $H^*(t, x(t), p(t))$.

2) Hamilton-Jacobi equations solution: we search for $(x^*, p^*)$, which
determines $v^*(t, x^*(t), p^*(t))$.
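The two steps can be tried on a toy problem that is not taken from the text: maximize $\int_0^T -\frac{1}{2}(x^2 + v^2)\,dt$ under $\dot{x} = v$, $x(0) = x_0$, with $A = 0$. Step 1 gives $v^* = p$ and $H^*(x, p) = -\frac{1}{2}x^2 + \frac{1}{2}p^2$. Step 2 is the two-point boundary problem $\dot{x} = p$, $\dot{p} = x$, $x(0) = x_0$, $p(T) = 0$, solved below by shooting on the unknown $p(0)$. This is a minimal sketch; the function names are illustrative.

```python
import math


def integrate_hamiltonian_system(x0, p0, T, steps=2000):
    """RK4 integration of x' = dH*/dp = p and p' = -dH*/dx = x. Returns (x(T), p(T))."""
    def rhs(x, p):
        return p, x

    h = T / steps
    x, p = x0, p0
    for _ in range(steps):
        k1x, k1p = rhs(x, p)
        k2x, k2p = rhs(x + 0.5 * h * k1x, p + 0.5 * h * k1p)
        k3x, k3p = rhs(x + 0.5 * h * k2x, p + 0.5 * h * k2p)
        k4x, k4p = rhs(x + h * k3x, p + h * k3p)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p


def initial_costate(x0, T):
    """Shooting: find p(0) such that p(T) = A'(x(T)) = 0.

    The system is linear, so p(T) is affine in p(0) and two trial shots suffice.
    """
    _, pT_a = integrate_hamiltonian_system(x0, 0.0, T)
    _, pT_b = integrate_hamiltonian_system(x0, 1.0, T)
    return -pT_a / (pT_b - pT_a)


def exact_initial_costate(x0, T):
    """Closed form for this toy problem: x(t) = x0 cosh(T - t) / cosh(T), p = x'."""
    return -x0 * math.tanh(T)


print(initial_costate(1.0, 1.0), exact_initial_costate(1.0, 1.0))
```

For nonlinear Hamiltonian systems the same idea applies, but the root in $p(0)$ would have to be found iteratively (for instance by bisection or Newton's method) rather than from two shots.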
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Example 6.3 Deterministic portfolio optimization with transaction 
costs 
(see [156]) 

Assume that an individual invests his wealth in cash with a rate of return
$r(t)$, and in a stock with price $S(t)$ and dividend rate $d(t)$.

Each transaction (buying or selling) has a proportional cost $\theta$ and there
exists an upper bound $N$ on the number of traded stocks.

Denote: $C(t)$ the cash amount, $n(t)$ the number of purchased stocks, and
$q(t)$ the number of sold stocks.

The state variable is $x(t) = (C(t), n(t))$ and the control variable is $v(t) =
q(t)$.

The optimization problem is:

\[
\max \big(C(T) + n(T) S(T)\big)
\]

\[
\dot{C} = rC + dn + S(q - \theta |q|),
\]

\[
\dot{n} = -q,
\]

\[
C(0) = C_0 \text{ and } n(0) = n_0, \quad q \in [-N, N].
\]


The Hamiltonian does not depend on current time and is given by:

\[
H(x, p, v) = H(C, n, p_1, p_2, q) = p_1 \big(rC + dn + S(q - \theta |q|)\big) - p_2 q.
\]


Step 1: Hamiltonian maximization, w.r.t. $q$ on $[-N, N]$.
The solution is given by:


\[
\begin{array}{ll}
q^* = N & \text{if } p_1 S(1 - \theta) > p_2, \\
q^* = 0 & \text{if } p_1 S(1 - \theta) < p_2 < p_1 S(1 + \theta), \\
q^* = -N & \text{if } p_2 > p_1 S(1 + \theta),
\end{array}
\]


and

\[
H^*(x, p) = p_1 (rC + dn) + N \max\big(p_1 S(1 - \theta) - p_2,\; p_2 - p_1 S(1 + \theta),\; 0\big).
\]


Step 2: Hamilton-Jacobi equations solution.


\[
\begin{cases}
\dot{p}_1 = -\dfrac{\partial H^*}{\partial C} = -r p_1, & p_1(T) = 1, \\[2mm]
\dot{p}_2 = -\dfrac{\partial H^*}{\partial n} = -d p_1, & p_2(T) = S(T).
\end{cases}
\]


The solution is given by:

\[
\begin{cases}
p_1(t) = \exp\left[\int_t^T r(s)\,ds\right], \\[2mm]
p_2(t) = S(T) + \int_t^T d(s)\, p_1(s)\,ds.
\end{cases}
\]


Therefore, $p_1(t)$ can be viewed as the value at time $T$ of one monetary unit
invested in cash at time $t$, and $p_2(t)$ is the value at time $T$ of one
stock held from $t$ to $T$.
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The optimal control is “bang-bang” : it can have three values —N,0, or N 
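and is piece-wise constant. Its value depends on the ratio $\tau = p_2 / (p_1 S)$, which
is the current value of one monetary unit invested in stock from $t$ to $T$. We
have:

\[
\tau(t) = \frac{\exp\left[-\int_t^T r(s)\,ds\right] \left(S(T) + \int_t^T d(s)\, p_1(s)\,ds\right)}{S(t)}.
\]

Thus, the optimal control satisfies:


\[
\begin{array}{ll}
q^*_t = N & \text{if } \tau(t) < (1 - \theta), \\
q^*_t = 0 & \text{if } (1 - \theta) < \tau(t) < (1 + \theta), \\
q^*_t = -N & \text{if } \tau(t) > (1 + \theta).
\end{array}
\]


Note that if there exists no friction ($\theta = 0$ and $N = \infty$), then $\tau(t)$ must be
equal to 1 in order to have a solution. In that case, we recover the standard
no-arbitrage valuation:

\[
S(t) = \exp\left[-\int_t^T r(s)\,ds\right] \left(S(T) + \int_t^T d(s)\, p_1(s)\,ds\right).
\]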
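The bang-bang rule can be sketched in a few lines when the interest rate and the dividend rate are constant, an assumption made here only to obtain closed forms for the costates: $p_1(t) = e^{r(T-t)}$ and $p_2(t) = S(T) + d\,(e^{r(T-t)} - 1)/r$. The function names and the numerical values are illustrative.

```python
import math


def costates(t, T, r, d, S_T):
    """Costates p1(t), p2(t) for constant rates r (interest, r > 0) and d (dividend)."""
    p1 = math.exp(r * (T - t))
    p2 = S_T + d * (math.exp(r * (T - t)) - 1.0) / r
    return p1, p2


def tau(t, T, r, d, S_t, S_T):
    """Ratio p2 / (p1 * S(t)) that drives the optimal control."""
    p1, p2 = costates(t, T, r, d, S_T)
    return p2 / (p1 * S_t)


def optimal_sales(t, T, r, d, S_t, S_T, theta, N):
    """Bang-bang control: number of stocks sold (negative values are purchases)."""
    ratio = tau(t, T, r, d, S_t, S_T)
    if ratio < 1.0 - theta:
        return N       # the stock is too expensive today: sell at the maximal rate
    if ratio > 1.0 + theta:
        return -N      # the stock is cheap today: buy at the maximal rate
    return 0.0         # inside the no-transaction band


for price in (90.0, 95.5, 100.0):
    print(price, optimal_sales(0.0, 1.0, 0.05, 0.0, price, 100.0, theta=0.01, N=10.0))
```

The band $[1 - \theta, 1 + \theta]$ widens with the proportional cost, so higher costs make trading less frequent.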





6.1.2.2 The Bellman principle 


The Bellman approach considers that problem $(P'')$ can be embedded in a
more general class, parametrized by the initial date $t_0$ and the initial state
value $x_{t_0}$, solved by the dynamic programming principle.

Consider the functional $\mathcal{J}$ defined by:


\[
\mathcal{J}(t_0, x_{t_0}) = \max_{v(.)} \int_{t_0}^T U(t, x(t), v(t))\,dt + A[x(T)], \tag{6.20}
\]

under:

\[
\dot{x}(t) = f(t, x(t), v(t)) \text{ if } v(.) \text{ is continuous at } t, \tag{6.21}
\]

\[
x(t_0) = x_{t_0} \text{ and } v(.) \in \mathcal{V},
\]

where $\mathcal{V}$ is the set of piece-wise continuous functions.

- Assume that the ODE $\dot{x}(t) = f(t, x(t), v(t))$ has one and only one solution.


- Suppose that $(t_0, x_{t_0})$ belongs to the domain $D$ where $\mathcal{J}$ is defined.
Consider the subdivision of the interval $[t_0, T]$ into $[t_0, t_0 + h]$ and $[t_0 + h, T]$.
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The optimization problem can be divided into two steps: 


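- Maximization on $[t_0 + h, T]$, knowing the state value at time $t_0 + h$. Denote
the optimal value by $\mathcal{J}(t_0 + h, x_{t_0+h})$.


- Maximization on $[t_0, t_0 + h]$, taking account of its impact on
$\mathcal{J}(t_0 + h, x_{t_0+h})$.


This leads to the dynamic programming principle:


\[
\mathcal{J}(t_0, x_{t_0}) = \max_{v(.)} \left[\int_{t_0}^{t_0+h} U(t, x(t), v(t))\,dt + \mathcal{J}(t_0 + h, x_{t_0+h})\right]. \tag{6.22}
\]

If $(t_0, x_{t_0})$ belongs to the interior of the domain $D$, $\mathcal{J}(t_0 + h, x_{t_0+h})$ exists for
sufficiently small values of $h$ and we have:


\[
0 = \max_{v(.)} \frac{1}{h} \left[\int_{t_0}^{t_0+h} U(t, x(t), v(t))\,dt + \mathcal{J}(t_0 + h, x_{t_0+h}) - \mathcal{J}(t_0, x_{t_0})\right]. \tag{6.23}
\]

Assuming that $\mathcal{J}$ is differentiable at $(t_0, x_{t_0})$, we deduce (for $h \to 0^+$), for
any admissible value $v$ of the control:


\[
0 \geq U(t_0, x_{t_0}, v) + \frac{\partial \mathcal{J}}{\partial t}(t_0, x_{t_0}) + \left\langle \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}), f(t_0, x_{t_0}, v) \right\rangle,
\]


\[
0 \geq \frac{\partial \mathcal{J}}{\partial t}(t_0, x_{t_0}) + \max_{v} \left[U(t_0, x_{t_0}, v) + \left\langle \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}), f(t_0, x_{t_0}, v) \right\rangle\right].
\]


Therefore, using the Hamiltonian, we have:


\[
0 \geq \frac{\partial \mathcal{J}}{\partial t}(t_0, x_{t_0}) + H^*\left(t_0, x_{t_0}, \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0})\right). \tag{6.24}
\]


Under mild assumptions, the reverse inequality can be proved. Thus, for
varying $t$ and $x$, $\mathcal{J}$ satisfies the Bellman partial differential equation (PDE).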


THEOREM 6.2 Bellman PDE 


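If the value function $\mathcal{J}$ associated to problem $(P'')$ is defined and
continuously differentiable on $]t_0, T[ \times O$ where $O$ is a convex open subset of $\mathbb{R}^d$, then
it satisfies the Bellman PDE:

For any $(t, x) \in\, ]t_0, T[ \times O$,


\[
\frac{\partial \mathcal{J}}{\partial t}(t, x) + H^*\left(t, x, \frac{\partial \mathcal{J}}{\partial x}(t, x)\right) = 0 \quad \text{with } \mathcal{J}(T, x) = A(x). \tag{6.25}
\]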
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REMARK 6.3 The solution is deduced through two steps: 
- Hamiltonian maximization. 
- Bellman PDE solution. 


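The Bellman approach allows recovery of the maximum principle and the
Hamilton-Jacobi equations, introduced in the Pontryagin approach.


Indeed, consider $p(t) = \frac{\partial \mathcal{J}}{\partial x}(t, x(t))$ and assume that $\mathcal{J}$ is twice-continuously
differentiable. Then, we have to check that $p(.)$ is solution of the
Hamilton-Jacobi ODE, and also that $p(.)$ satisfies the transversality condition.

1) $p(.)$ is solution of the Hamilton-Jacobi ODE:

Since $p(t) = \frac{\partial \mathcal{J}}{\partial x}(t, x(t))$, by differentiating $p_i(t)$, we get:

\[
\dot{p}_i(t) = \frac{d}{dt}\left[\frac{\partial \mathcal{J}}{\partial x_i}(t, x(t))\right]
= \frac{\partial^2 \mathcal{J}}{\partial t\, \partial x_i}(t, x(t)) + \sum_j \frac{\partial^2 \mathcal{J}}{\partial x_i\, \partial x_j}(t, x(t))\, \dot{x}_j(t).
\]


Differentiating the Bellman equation $\frac{\partial \mathcal{J}}{\partial t} + H^*\left(t, x, \frac{\partial \mathcal{J}}{\partial x}\right) = 0$ w.r.t. $x_i$, we
deduce:


\[
\frac{\partial^2 \mathcal{J}}{\partial t\, \partial x_i} + \sum_j \frac{\partial H^*}{\partial p_j}\, \frac{\partial^2 \mathcal{J}}{\partial x_j\, \partial x_i} + \frac{\partial H^*}{\partial x_i} = 0.
\]


Finally, since $f_j(t, x, v^*) = \frac{\partial H^*}{\partial p_j}\left(t, x, \frac{\partial \mathcal{J}}{\partial x}\right)$ and $\dot{x}_j = f_j(t, x, v^*)$, the result is
proved:

\[
\dot{p}_i(t) = -\frac{\partial H^*}{\partial x_i}\big(t, x(t), p(t)\big).
\]


2) $p(T)$ satisfies the transversality condition:


Differentiating w.r.t. $x_i$ the terminal value of the Bellman equation, we have:


\[
\frac{\partial \mathcal{J}}{\partial x_i}(T, x) = \frac{\partial A}{\partial x_i}(x),
\]

which implies:

\[
p(T) = A'(x(T)).
\]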


REMARK 6.4 The Bellman method allows us to more easily find the
solution when its form is anticipated. Otherwise, computation difficulties of
the Bellman PDE and Hamilton-Jacobi ODE are often similar.
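When the form of the value function is anticipated, the Bellman PDE can be checked directly. The sketch below uses a toy problem that is not taken from the text: maximize $\int_t^T -\frac{1}{2}(x^2 + v^2)\,ds$ under $\dot{x} = v$, with $A = 0$. Here $H^*(x, p) = -\frac{1}{2}x^2 + \frac{1}{2}p^2$, and the candidate $\mathcal{J}(t, x) = -\frac{1}{2}x^2 \tanh(T - t)$ is tested by finite differences. The function names are illustrative.

```python
import math


def value(t, x, T):
    """Candidate value function J(t, x) = -(x**2 / 2) * tanh(T - t)."""
    return -0.5 * x * x * math.tanh(T - t)


def maximized_hamiltonian(x, p):
    """H*(x, p) = max_v [-(x**2 + v**2)/2 + p*v], attained at v = p."""
    return -0.5 * x * x + 0.5 * p * p


def bellman_residual(t, x, T, eps=1e-5):
    """dJ/dt + H*(x, dJ/dx), with central finite differences for both derivatives."""
    dJ_dt = (value(t + eps, x, T) - value(t - eps, x, T)) / (2 * eps)
    dJ_dx = (value(t, x + eps, T) - value(t, x - eps, T)) / (2 * eps)
    return dJ_dt + maximized_hamiltonian(x, dJ_dx)


# The residual should vanish (up to discretization error) at interior points,
# and the terminal condition J(T, x) = A(x) = 0 should hold exactly.
print(bellman_residual(0.3, 1.5, 1.0), value(1.0, 1.5, 1.0))
```

The costate of the Pontryagin approach is recovered as $p(t) = \frac{\partial \mathcal{J}}{\partial x}(t, x(t)) = -x(t)\tanh(T - t)$, which vanishes at $T$ as required by the transversality condition.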
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6.1.3 Stochastic optimal control 
6.1.3.1 Introduction to stochastic optimal control 


The theory of stochastic optimal control extends previous methods to the 
stochastic dynamical systems. For basic notations, definitions and properties 
of stochastic processes, we refer to Karatzas and Shreve ([320], [321]) and Jacod
and Shiryaev [294]. Appendix B provides a short survey about these notions. 


Consider the case of systems associated to stochastic differential equations
(SDE):

\[
dX_t = f(t, X_t, v(t))\,dt + g(t, X_t, v(t))\,dW_t \quad \text{with } X_{t_0} = x_0, \tag{6.26}
\]


where $X_t$ is the state variable with values in $\mathbb{R}^d$, $v(t)$ is the control variable
with values in $\mathbb{R}^{d'}$, and $W$ is a $d''$-dimensional Brownian motion with $d'' \leq d$.


The functions $f(.,.)$ and $g(.,.)$ are assumed to satisfy usual conditions in
order to ensure that the previous SDE has one and only one solution. For
example:

- The functions $f(.,.)$ and $g(.,.)$ have linear growth:


\[
\|f(t, x)\| \leq (1 + \|x\|)\, M_t \quad \text{and} \quad \|g(t, x)\| \leq (1 + \|x\|)\, M_t,
\]


where $M(.)$ is a positive deterministic function, upper bounded on each
compact subset of $[0, T]$.


- The functions $f(.,.)$ and $g(.,.)$ are Lipschitzian:


\[
\|f(t, x) - f(t, y)\| \leq \|x - y\|\, K_t \quad \text{and} \quad \|g(t, x) - g(t, y)\| \leq \|x - y\|\, K_t,
\]


where $K(.)$ is also a positive deterministic function, upper bounded on each
compact subset of $[0, T]$.


Due to the observation of the path on time period $[0, T]$, the acquired
information may modify the control variable $v(.)$. Thus, we assume that $v(.)$ is
now a stochastic process $(v(t, \omega))_t$. Furthermore, if $v(.)$ is supposed to be such
that $v(t, \omega) = \hat{v}(t, X_t(\omega))$, where $\hat{v}$ is a deterministic function $\hat{v} : [0, T] \times \mathbb{R}^d
\to \mathbb{R}^{d'}$, then $v(.)$ is said to be “feedback” and the system is Markovian:


\[
dX_t = f(t, X_t, \hat{v}(t, X_t))\,dt + g(t, X_t, \hat{v}(t, X_t))\,dW_t \quad \text{with } X_{t_0} = x_0. \tag{6.27}
\]

The information is modelled by the filtration $\mathcal{F}^X$ generated by process $X$.


The matrix $g(t, x)$ is assumed to satisfy the following: There exists $\varepsilon > 0$
such that $\mathrm{Trace}\left({}^t g(t, x)\, g(t, x)\right) \geq \varepsilon$ (non-degenerate).
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Under previous assumptions, we are led to the following stochastic control 
problem $(P_e)$:


\[
\max_{v(.)} \; E\left[\int_{t_0}^T U(t, X(t), v(t))\,dt + A[X(T)] \;\Big|\; X(t_0) = x_0\right] \tag{6.28}
\]

under, for $t_0 \leq t \leq T$,

\[
dX_t = f(t, X_t, v(t))\,dt + g(t, X_t, v(t))\,dW_t, \tag{6.29}
\]

\[
X(t_0) = x_0 \text{ and } v(t) = \hat{v}(t, X_t(\omega)) \in V \text{ a.s.},
\]


where $V$ is an open convex subset of $\mathbb{R}^{d'}$.


The objective function $U$ is assumed to be twice-differentiable on $[t_0, T] \times
\mathbb{R}^d \times \mathbb{R}^{d'}$, and $A$ is twice-differentiable on $\mathbb{R}^d$.


The maximization is made on functions $\hat{v}$ which are piece-wise continuous
w.r.t. the current time $t$ and Lipschitzian w.r.t. the state variable $x$.


Problem $(P_e)$ is embedded in the class parametrized by $(t, x)$. Thus, we can
apply the Bellman approach. For this purpose, we define the new Hamiltonian.


DEFINITION 6.2 The Hamiltonian $H(t, x, p, q, v)$ associated to problem
$(P_e)$ is given by: for $(t, x, p, q, v) \in [t_0, T] \times \mathbb{R}^d \times \mathbb{R}^d \times (\mathbb{R}^d \times \mathbb{R}^d) \times \mathbb{R}^{d'}$,


\[
H(t, x, p, q, v) = U(t, x, v) + \sum_{i=1}^{d} p_i f_i(t, x, v)
+ \frac{1}{2} \sum_{i,l=1}^{d} q_{i,l} \left(\sum_{j=1}^{d''} g_{i,j}(t, x, v)\, g_{l,j}(t, x, v)\right). \tag{6.30}
\]


With respect to the deterministic case, to any pair of state variables $(x_i, x_l)$ is
associated a new variable $q_{i,l}$ with a coefficient equal to half the instantaneous
covariance of the random variables $X_i$ and $X_l$. This term is due to the Dynkin
operator of the process $X$, denoted by $D(X)$. Note that $D(X)$ can be viewed
as a “mean or expectation of variation rate” of the process $X$, “roughly” equal
to the ordinary derivative w.r.t. current time as for deterministic functions:


\[
D(X) = \frac{E\left[dX_t \mid \mathcal{F}_t\right]}{dt}.
\]


The Hamiltonian can also be expressed as:


\[
U(t, x, v) + \langle p, f(t, x, v) \rangle + \frac{1}{2} \mathrm{Trace}\left({}^t g(t, x, v)\; q\; g(t, x, v)\right).
\]
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6.1.3.2 The stochastic Bellman principle 


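We apply the dynamic programming approach to the stochastic framework.
Denote $\mathcal{J}(t_0, x_0)$ as the value function associated to problem $(P_e)$.
Assume that $(t_0, x_0)$ belongs to the interior of the domain of $\mathcal{J}$.


This leads to the stochastic dynamic programming principle:


\[
\mathcal{J}(t_0, x_{t_0}) = \max_{v} E\left[\int_{t_0}^{t_0+h} U(t, X(t), v(t))\,dt + \mathcal{J}(t_0 + h, X_{t_0+h}) \;\Big|\; X_{t_0} = x_0\right]. \tag{6.31}
\]

Then:

\[
0 = \max_{v} \left\{ \frac{1}{h} E\left[\int_{t_0}^{t_0+h} U(t, X(t), v(t))\,dt \;\Big|\; X_{t_0} = x_0\right]
+ \frac{1}{h} E\left[\mathcal{J}(t_0 + h, X_{t_0+h}) - \mathcal{J}(t_0, X_{t_0}) \;\Big|\; X_{t_0} = x_0\right] \right\}, \tag{6.32}
\]


where the optimum is searched on the set of feedback controls $\hat{v}$, and the
process $X$ is solution of the SDE:


\[
dX_t = f(t, X_t, \hat{v}(t, X_t))\,dt + g(t, X_t, \hat{v}(t, X_t))\,dW_t \quad \text{with } X_{t_0} = x_0. \tag{6.33}
\]


For $h \to 0^+$, we deduce:


\[
\lim_{h \to 0^+} \frac{1}{h} E\left[\int_{t_0}^{t_0+h} U(t, X(t), v(t))\,dt \;\Big|\; X_{t_0} = x_0\right] = U(t_0, x_{t_0}, v).
\]


Then, applying Ito's lemma, we get:


\[
\lim_{h \to 0^+} \frac{1}{h} E\left[\mathcal{J}(t_0 + h, X_{t_0+h}) - \mathcal{J}(t_0, X_{t_0}) \;\Big|\; X_{t_0} = x_0\right] = D\mathcal{J}(t_0, x_{t_0}).
\]


Thus, we have:


\[
0 \geq U(t_0, x_{t_0}, v) + \frac{\partial \mathcal{J}}{\partial t}(t_0, x_{t_0}) + \left\langle \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}), f(t_0, x_{t_0}, v) \right\rangle
+ \frac{1}{2} \mathrm{Trace}\left({}^t g(t_0, x_{t_0}, v)\, \frac{\partial^2 \mathcal{J}}{\partial x^2}(t_0, x_{t_0})\, g(t_0, x_{t_0}, v)\right).
\]


Using mild assumptions, the equality is proved.


Set

\[
p = \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}) \quad \text{and} \quad q = \frac{\partial^2 \mathcal{J}}{\partial x^2}(t_0, x_{t_0}).
\]


Therefore, using the Hamiltonian, we are led to the stochastic Bellman
equation:


\[
0 = \frac{\partial \mathcal{J}}{\partial t}(t_0, x_{t_0}) + H^*\left(t_0, x_{t_0}, \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}), \frac{\partial^2 \mathcal{J}}{\partial x^2}(t_0, x_{t_0})\right), \tag{6.34}
\]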
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where 


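\[
H^*\left(t_0, x_{t_0}, \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}), \frac{\partial^2 \mathcal{J}}{\partial x^2}(t_0, x_{t_0})\right)
= \max_{v \in V} H\left(t_0, x_{t_0}, \frac{\partial \mathcal{J}}{\partial x}(t_0, x_{t_0}), \frac{\partial^2 \mathcal{J}}{\partial x^2}(t_0, x_{t_0}), v\right).
\]

Thus, for varying $t$ and $x$, $\mathcal{J}$ satisfies the stochastic Bellman equation.

THEOREM 6.3 Stochastic Bellman equation

If the value function $\mathcal{J}$ associated to problem $(P_e)$ is defined and
continuously differentiable on $]t_0, T[ \times O$ where $O$ is a convex open subset of $\mathbb{R}^d$, then
it satisfies the Bellman equation: for any $(t, x) \in\, ]t_0, T[ \times O$,


\[
\frac{\partial \mathcal{J}}{\partial t}(t, x) + H^*\left(t, x, \frac{\partial \mathcal{J}}{\partial x}(t, x), \frac{\partial^2 \mathcal{J}}{\partial x^2}(t, x)\right) = 0 \quad \text{with } \mathcal{J}(T, x) = A(x). \tag{6.35}
\]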


REMARK 6.5 The solution is again deduced through two steps:
- Hamiltonian maximization:

\[
H^*(t, x, p, q) = \max_{v \in V} H(t, x, p, q, v), \text{ and}
\]

- Solution of the Bellman equation:


For a solution $v^*(.)$ of the previous optimization problem, we have to solve the
Bellman PDE:


\[
\frac{\partial \mathcal{J}}{\partial t}(t, x) + H^*\left(t, x, \frac{\partial \mathcal{J}}{\partial x}(t, x), \frac{\partial^2 \mathcal{J}}{\partial x^2}(t, x)\right) = 0 \quad \text{with } \mathcal{J}(T, x) = A(x).
\]


Under previous assumptions, there exists a maximal domain on which there
exists one and only one solution. However, solutions are rarely explicit.
For stationary problems, explicit solutions may be determined.


Consider, for example, the case of an infinite horizon and suppose that the 
investor has an exponential utility function (CARA utility). 
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Example 6.4 
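Suppose that $T = \infty$ and $U$ is such that:


\[
U(t, x, v) = e^{-\rho t} u(x, v),
\]

with $\rho > 0$.
Suppose also that $d = 1$ (univariate case).


Consider


\[
\mathcal{J}(t, x) = \max_{v} E\left[\int_t^{\infty} e^{-\rho s} u(X(s), v(s))\,ds \;\Big|\; X_t = x\right], \tag{6.36}
\]


with

\[
dX_s = f(X_s, \hat{v}(s, X_s))\,ds + g(X_s, \hat{v}(s, X_s))\,dW_s, \tag{6.37}
\]


\[
X(t) = x \text{ and } v(s) = \hat{v}(s, X_s) \in V.
\]


Then, if $\mathcal{J}(0, x) = I(x)$ is defined and twice-continuously differentiable on
the open subset $O$, $\mathcal{J}(t, x)$ is defined on the whole space $\mathbb{R} \times O$ and we have:


For any $(t, x) \in \mathbb{R} \times O$,


\[
\mathcal{J}(t, x) = e^{-\rho t} I(x),
\]


where $I$ is solution of the ODE:


\[
\rho I(x) = \max_{v \in V} \left[u(x, v) + I'(x) f(x, v) + \frac{1}{2} I''(x)\, g^2(x, v)\right].
\]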
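A stationary problem with an explicit solution illustrates the ODE satisfied by $I$. The specification below is an assumption made for the example and is not taken from the text: $u(x, v) = -\frac{1}{2}(x^2 + v^2)$, $f(x, v) = v$ and $g(x, v) = \sigma$ constant. Trying $I(x) = -\frac{1}{2} a x^2 - b$, the maximizer is $v^* = I'(x) = -a x$, and matching coefficients gives $a^2 + \rho a - 1 = 0$ and $b = a \sigma^2 / (2\rho)$. The function names are illustrative.

```python
import math


def quadratic_coefficients(rho, sigma):
    """Coefficients (a, b) of the candidate I(x) = -(a * x**2) / 2 - b."""
    a = (-rho + math.sqrt(rho * rho + 4.0)) / 2.0   # positive root of a^2 + rho a - 1 = 0
    b = a * sigma * sigma / (2.0 * rho)
    return a, b


def bracket(x, v, rho, sigma):
    """u(x, v) + I'(x) f(x, v) + I''(x) g^2 / 2 for the candidate I."""
    a, _ = quadratic_coefficients(rho, sigma)
    return -0.5 * (x * x + v * v) + (-a * x) * v + 0.5 * (-a) * sigma * sigma


def ode_residual(x, rho, sigma):
    """rho I(x) minus the maximized bracket; the maximum is attained at v = I'(x)."""
    a, b = quadratic_coefficients(rho, sigma)
    value = -0.5 * a * x * x - b
    return rho * value - bracket(x, -a * x, rho, sigma)


print(quadratic_coefficients(0.1, 0.3), ode_residual(2.0, 0.1, 0.3))
```

The optimal feedback $v^*(x) = -a x$ pulls the state toward zero, and the noise only shifts the value function by the constant $b$.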


The next section illustrates this approach to determine optimal financial 
portfolios. 
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6.2 Lifetime portfolio selection 
6.2.1 The optimization problem 


In what follows, we consider the approach introduced by Samuelson [447], 
and further studied by Merton ([385], [386] and [387]), who used dynamic 
programming methods in order to deduce explicit solutions when security pa- 
rameters are deterministic. A more general solution is also examined when 
security parameters are no longer deterministic. 


Assumptions (S) on securities: 


- Let $S$ be the price vector of $d$ securities. Let $(\Omega, P)$ represent the probability
space. Assume that $S$ is defined by the following stochastic differential
equation (SDE):

\[ dS_t = S_t\left(\mu(t, S_t)\, dt + \sigma(t, S_t)\, dW_t\right), \tag{6.38} \]

where $W = (W_1, \dots, W_d)$ is a $d$-dimensional Brownian motion such that

\[ \forall i \neq j,\quad \mathrm{Cov}(W_{i,t}, W_{j,t}) = \rho_{ij}\, t \quad\text{and}\quad \mathrm{Var}(W_{i,t}) = t. \tag{6.39} \]


- The information is modelled by the filtration $(\mathcal{F}_t)_t$ generated by the Brownian
motion (and, as usual, completed in order to contain all $P$-null sets).


- The time horizon is supposed to be finite and is denoted by $T$.


- The processes $\mu(.,.) = (\mu_1, \dots, \mu_d)(.,.)$, which model the instantaneous
expectations, and $\sigma(.,.) = (\sigma_1, \dots, \sigma_d)(.,.)$, which model the volatilities, satisfy
usual conditions which ensure that the previous SDE has one and only one
solution (see for example [294]). For instance:


i) These processes are measurable, $\mathcal{F}_t$-adapted and uniformly bounded on
$[0,T] \times \Omega$.


ii) The matrix $\sigma(t,.)$ is invertible with bounded inverse for any $t \in [0,T]$.
The process $\sigma$ is also predictable.


REMARK 6.6 Under the previous hypothesis, the financial market is
complete and without arbitrage opportunity.

Assumptions (P) on portfolio strategies:


- At any time $t$, the investor chooses the positive amount $c_t$ per time unit,
which is assigned to his consumption, and also the portfolio weighting $w_t$.


- The investor's strategy is supposed to be self-financing and the cumulative
consumption $\int_0^t c_s\, ds$ is an $\mathcal{F}_t$-adapted process with $\int_0^T c_s\, ds < \infty$, $P$-a.s.


- The portfolio weighting $w$ is predictable and such that $\int_0^T \|w_s\|^2\, ds < \infty$,
$P$-a.s., where $\|.\|$ denotes the norm.


Thus, the portfolio value $V_t$ is an Ito process defined by:

\[ V_t = V_0 + \sum_{i=1}^{d} \int_0^t w_{i,s}\, V_s\, \frac{dS_{i,s}}{S_{i,s}} - \int_0^t c_s\, ds. \tag{6.40} \]


Assumptions (U) on utility functions: 


For any time $t$, let $U(.,t)$ and $\hat{U}(.,t)$ be two utility functions satisfying,
for all $t \in [0,T]$:


- $U(.,t)$ and $\hat{U}(.,t)$ are defined on $\mathbb{R}^+$, strictly concave, non-decreasing and
continuously differentiable.


- $\lim_{x \to \infty} \frac{\partial U}{\partial x}(x,t) = 0$ and $\lim_{x \to \infty} \frac{\partial \hat{U}}{\partial x}(x,t) = 0$.


- $U(.,.)$ and $\hat{U}(.,.)$ are continuous on $\mathbb{R}^+ \times [0,T]$ (for example: $U(x,t) = e^{-\rho t} u(x)$
with $\rho$ a positive scalar).


Note that, under the first assumption (strict concavity), the marginal utilities
$\frac{\partial U}{\partial x}(.,t)$ and $\frac{\partial \hat{U}}{\partial x}(.,t)$ are strictly decreasing. Therefore, their inverse
functions $J$ and $\hat{J}$ exist.


The maximization of the expected intertemporal utility along the time period
$[0,T]$ is the following optimization problem:

\[ \max_{c,\,w}\; E\left[\int_0^T U(c_s, s)\, ds + \hat{U}(V_T, T)\right]. \tag{6.41} \]





There exists one constraint corresponding to the initial budget: at time 
t = 0, the portfolio value V is equal to a given value Vo. 


6.2.2 The deterministic coefficients case 


The functional to be optimized is time-additive and strategies are assumed
to be non-anticipative (i.e., they are functions of past or current information
and not of future observations).

Therefore, as seen in the previous section, the solution can be determined
through dynamic programming: "an optimal strategy on the whole time period
$[0,T]$ must be optimal on any sub-period $[t,T]$."


Then, at any time $t$, the consumption rate process $c$ and the weighting
process $w$ are solutions of the following problem:


- Consider the functional $\mathcal{J}$ ("the value-function") given by:

\[ \mathcal{J}(V, S, t) = \max_{c,\,w}\; E_t\left[\int_t^T U(c_s, s)\, ds + \hat{U}(V_T, T)\right], \tag{6.42} \]

where $E_t$ denotes the conditional expectation given the information at time $t$.





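- Consider the functional $\Phi$ defined by:

\[ \Phi(c, w, V, S, t) = U(c, t) + D(\mathcal{J}), \tag{6.43} \]

where $D$ denotes the Dynkin operator associated to the price variable $S$ and the
value $V$ for given control variables $c$ and $w$:

\[
\begin{aligned}
D ={}& \frac{\partial}{\partial t} + \Big(\sum_{i=1}^{d} w_{i,t}\,\mu_{i,t}\,V_t - c_t\Big)\frac{\partial}{\partial V} + \sum_{i=1}^{d} \mu_{i,t}\,S_{i,t}\,\frac{\partial}{\partial S_i} \\
&+ \frac{1}{2}\sum_{i=1}^{d}\sum_{j=1}^{d} \sigma_{ij,t}\,w_{i,t}\,w_{j,t}\,V_t^2\,\frac{\partial^2}{\partial V^2} + \frac{1}{2}\sum_{i=1}^{d}\sum_{j=1}^{d} \sigma_{ij,t}\,S_{i,t}\,S_{j,t}\,\frac{\partial^2}{\partial S_i\,\partial S_j} \\
&+ \sum_{i=1}^{d}\sum_{j=1}^{d} \sigma_{ij,t}\,w_{i,t}\,V_t\,S_{j,t}\,\frac{\partial^2}{\partial S_j\,\partial V}.
\end{aligned}
\tag{6.44}
\]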


Using standard results of dynamic programming (see Kushner [340]), we 
deduce: 





PROPOSITION 6.4

If $U$ is strictly concave w.r.t. $c$, and if $\hat{U}$ is concave w.r.t. $V$, then there
exists a set of optimal controls $w^*$ and $c^*$ such that $\sum_{i=1}^{d} w_i^* = 1$,
$\mathcal{J}(V, S, T) = \hat{U}(V_T, T)$, and for any $t$ in $[0,T]$,

\[ 0 = \Phi(c^*, w^*, V, S, t) \geq \Phi(c, w, V, S, t). \tag{6.45} \]


This first result shows that the optimization problem is equivalent to the
maximization of the functional $\Phi(c, w, V, S, t)$ under the following constraint
on weights $w$: $\sum_{i=1}^{d} w_{i,t} = 1$. This can be done by using the Lagrangian:

\[ L = \Phi(c, w, V, S, t) + \lambda\Big(1 - \sum_{i=1}^{d} w_i\Big), \]


where $\lambda$ denotes the usual Lagrange parameter. Under previous assumptions,
the first-order conditions are given by: for any $i \in \{1, \dots, d\}$,

\[ 0 = L_c(c^*, w^*) = U_c(c^*, t) - \mathcal{J}_V, \tag{6.46} \]

\[ 0 = L_{w_i}(c^*, w^*) = -\lambda + \mathcal{J}_V\, \mu_i\, V + \mathcal{J}_{VV} \sum_{j=1}^{d} \sigma_{ij}\, w_j^*\, V^2 + \sum_{j=1}^{d} \mathcal{J}_{S_j V}\, \sigma_{ij}\, S_j\, V, \]

\[ 0 = 1 - \sum_{i=1}^{d} w_i^*. \tag{6.47} \]

(Notation: $A_X$ denotes the partial derivative of $A$ w.r.t. $X$ and $A_{XY}$ the
partial derivative of order 2 w.r.t. $X$ and $Y$.)

Note that $L_{cc} = U_{cc} < 0$, $L_{c w_i} = 0$, $L_{w_i w_i} = \sigma_i^2 V^2 \mathcal{J}_{VV}$, $L_{w_i w_j} = \sigma_{ij} V^2 \mathcal{J}_{VV}$,
and that the volatility matrix $[\sigma_{ij}]_{i,j}$ is non-degenerate. Therefore, a sufficient
condition for the existence of an interior solution is: $\mathcal{J}_{VV} < 0$ (meaning
that $\mathcal{J}$ is strictly concave w.r.t. $V$). By differentiating Equation (6.46) w.r.t. $V$,
we get $U_{cc}\, \partial c^*/\partial V = \mathcal{J}_{VV}$, so the optimal consumption is an increasing
function of $V$.


Next, we must determine $c^*$, $w^*$, and $\lambda$ as functions of $S$, $V$, $t$,
and the derivatives of $\mathcal{J}$ by solving $(d + 2)$ implicit equations. Then, we
have to substitute these solutions into Equation (6.45), in order to get a
differential equation of the second order w.r.t. $\mathcal{J}$ with boundary condition:
$\mathcal{J}(V, S, T) = \hat{U}(V_T, T)$.


Denote by $J$ the inverse of the marginal utility $U_c$ w.r.t. the consumption:

\[ J = (U_c)^{-1}. \tag{6.48} \]

From Equation (6.46), we deduce:

\[ c^* = J(\mathcal{J}_V, t). \tag{6.49} \]


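The optimal weights $w^*$ are determined from a linear system, which allows
us to get explicit solutions. For this purpose, denote:

\[ \Sigma = [\sigma_{ij}]_{i,j}, \qquad \Sigma^{-1} = [v_{ij}]_{i,j}, \qquad \text{and} \qquad \Gamma = \sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}. \]

By eliminating the Lagrange parameter in the second first-order condition, we
determine the weights $w_i^*$:

\[ w_{i,t}^* = h_i(S_t, t) + m(S_t, V_t, t)\, g_i(S_t, t) + f_i(S_t, V_t, t), \tag{6.50} \]

with

\[ \sum_{i=1}^{d} h_i = 1, \qquad \sum_{i=1}^{d} g_i = 0, \qquad \sum_{i=1}^{d} f_i = 0, \]

and

\[ h_i(S_t, t) = \frac{1}{\Gamma}\sum_{j=1}^{d} v_{ij}, \qquad m(S_t, V_t, t) = -\frac{\mathcal{J}_V}{V\, \mathcal{J}_{VV}}, \]

\[ g_i(S_t, t) = \frac{1}{\Gamma}\sum_{l=1}^{d} v_{il}\Big(\Gamma \mu_l - \sum_{h=1}^{d}\sum_{k=1}^{d} v_{hk}\, \mu_k\Big), \]

\[ f_i(S_t, V_t, t) = \frac{1}{\Gamma\, V\, \mathcal{J}_{VV}}\Big(\sum_{j=1}^{d} v_{ij} \sum_{k=1}^{d} \mathcal{J}_{S_k V}\, S_k - \Gamma\, \mathcal{J}_{S_i V}\, S_i\Big). \]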

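From previous expressions of $c^*$ and $w^*$, we deduce the second-order
differential equation satisfied by $\mathcal{J}$. Its coefficients are functions of $S_t$, $V_t$, and
$t$:

\[
\begin{aligned}
0 ={}& U(J(\mathcal{J}_V, t), t) + \mathcal{J}_t + \mathcal{J}_V\left[\frac{V}{\Gamma}\sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}\,\mu_j - J(\mathcal{J}_V, t)\right] \\
&+ \sum_{i=1}^{d} \mu_i\, S_i\, \mathcal{J}_{S_i} + \frac{1}{2}\sum_{i=1}^{d}\sum_{j=1}^{d} \sigma_{ij}\, S_i\, S_j\, \mathcal{J}_{S_i S_j} + \frac{V}{\Gamma}\sum_{j=1}^{d} S_j\, \mathcal{J}_{S_j V} + \frac{V^2}{2\Gamma}\, \mathcal{J}_{VV} \\
&- \frac{\mathcal{J}_V}{\Gamma\, \mathcal{J}_{VV}}\left[\Gamma \sum_{j=1}^{d} \mu_j\, S_j\, \mathcal{J}_{S_j V} - \Big(\sum_{j=1}^{d} S_j\, \mathcal{J}_{S_j V}\Big)\sum_{h=1}^{d}\sum_{k=1}^{d} v_{hk}\,\mu_k\right] \\
&- \frac{1}{2\Gamma\, \mathcal{J}_{VV}}\left[\Gamma \sum_{i=1}^{d}\sum_{j=1}^{d} \sigma_{ij}\, S_i\, S_j\, \mathcal{J}_{S_i V}\, \mathcal{J}_{S_j V} - \Big(\sum_{i=1}^{d} S_i\, \mathcal{J}_{S_i V}\Big)^2\right] \\
&- \frac{\mathcal{J}_V^2}{2\Gamma\, \mathcal{J}_{VV}}\left[\Gamma \sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}\,\mu_i\,\mu_j - \Big(\sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}\,\mu_i\Big)^2\right].
\end{aligned}
\tag{6.51}
\]

In addition, the functional $\mathcal{J}$ satisfies the equation:

\[ \mathcal{J}(V_T, S_T, T) = \hat{U}(V_T, T). \]

Then, the solution $\mathcal{J}^*$ of Equation (6.51) is introduced in equations (6.49)
and (6.50), which allows us to deduce $c^*$ and $w^*$.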


For the general case, the solutions of these equations are not explicit. More- 
over, for a large number of securities, numerical computations of solutions are 
tedious. 


However, for some cases, explicit solutions can be identified and thus more 
easily analyzed; for example, if securities are defined from a Brownian multidi- 
mensional process with constant coefficients (in that case, they have lognormal 
distributions). 


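Assume that functions $\mu(.,.) = (\mu_1, \dots, \mu_d)(.,.)$ and $\sigma(.,.) = (\sigma_1, \dots, \sigma_d)(.,.)$
are constant. Equation (6.51) becomes:

\[
\begin{aligned}
0 ={}& U(J(\mathcal{J}_V, t), t) + \mathcal{J}_t + \mathcal{J}_V\left[\frac{V}{\Gamma}\sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}\,\mu_i - J(\mathcal{J}_V, t)\right] + \frac{V^2}{2\Gamma}\, \mathcal{J}_{VV} \\
&- \frac{\mathcal{J}_V^2}{2\Gamma\, \mathcal{J}_{VV}}\left[\Gamma \sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}\,\mu_i\,\mu_j - \Big(\sum_{i=1}^{d}\sum_{j=1}^{d} v_{ij}\,\mu_i\Big)^2\right].
\end{aligned}
\tag{6.52}
\]

The solution $\mathcal{J}$ depends only on $V$ and $t$, not on $S$. In that case, optimal
weights $w^*$ satisfy:

\[ w_{i,t}^* = h_i + m(V_t, t)\, g_i, \tag{6.53} \]

where $h_i$ and $g_i$ are constant.


Therefore, at any time $t$, the weight vector $w_t^*$ belongs to a straight line included
in the hyperplane $\sum_{i=1}^{d} w_i = 1$. The term $m(V_t, t)$ acts as a parameter which
determines the position of $w_t^*$ on this line. It depends on the utility function,
contrary to the terms $h_i$ and $g_i$.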


PROPOSITION 6.5


Under previous assumptions, the optimal solution $w^*$ has the following
decomposition: there exist two mutual funds, independent from the investor's
preferences, such that the investor is indifferent between a combination of these
two funds and a combination of the $d$ given securities. These two funds are
characterized (up to a multiplicative coefficient) by their respective weightings
$\alpha$ and $\beta$ such that for any $i \in \{1, \dots, d\}$:

\[ \alpha_i = h_i + \frac{1 - \eta}{\nu}\, g_i, \tag{6.54} \]

\[ \beta_i = h_i - \frac{\eta}{\nu}\, g_i, \tag{6.55} \]

where $\eta$ and $\nu$ are arbitrary constant parameters ($\nu \neq 0$).
The solution $w^*$ is such that there exists a scalar $a(V_t, t)$ satisfying

\[ a(V_t, t) = \nu\, m(V_t, t) + \eta, \]

and

\[ w_{i,t}^* = a(V_t, t)\, \alpha_i + [1 - a(V_t, t)]\, \beta_i. \tag{6.56} \]

If there exists a riskless asset $S_0$ with rate of return $r$, the proportions are
given by: for any $i \in \{1, \dots, d\}$,

\[ \alpha_i = \frac{1 - \eta}{\nu} \sum_{j=1}^{d} v_{ij}\,(\mu_j - r), \tag{6.57} \]

\[ \beta_i = -\frac{\eta}{\nu} \sum_{j=1}^{d} v_{ij}\,(\mu_j - r), \tag{6.58} \]

\[ \alpha_0 = 1 - \sum_{j=1}^{d} \alpha_j \quad\text{and}\quad \beta_0 = 1 - \sum_{j=1}^{d} \beta_j. \tag{6.59} \]


REMARK 6.7 For a financial market with a return vector having a
Gaussian distribution, two mutual funds are sufficient to generate all optimal
portfolios. This dynamic two-fund separation property is analogous to the
Tobin-Markowitz separation property, proved in the mean-variance framework
(see Chapter 3). When a riskless asset exists, it can be considered as the first
mutual fund. Note that the value of the second one also has a lognormal
distribution.
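
The two-fund decomposition can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-asset market with constant coefficients: the covariance matrix `cov`, the drift vector `mu`, and the values chosen for $m$, $\eta$ and $\nu$ are illustrative assumptions, not data from the text. It computes $h_i$ and $g_i$ from their definitions, builds the two funds from (6.54) and (6.55), and compares the direct weights $h_i + m\,g_i$ of (6.53) with the mixture (6.56).

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def merton_h_g(cov, mu):
    """Return the vectors h and g entering w* = h + m g (constant coefficients)."""
    d = len(mu)
    v = invert(cov)                                   # [v_ij] = inverse of [sigma_ij]
    big_gamma = sum(sum(row) for row in v)            # Gamma = sum of all v_ij
    a = sum(v[k][l] * mu[l] for k in range(d) for l in range(d))
    h = [sum(v[i]) / big_gamma for i in range(d)]
    g = [sum(v[i][l] * (big_gamma * mu[l] - a) for l in range(d)) / big_gamma
         for i in range(d)]
    return h, g


def two_funds(h, g, eta, nu):
    """Fund weightings alpha (6.54) and beta (6.55)."""
    alpha = [hi + (1.0 - eta) / nu * gi for hi, gi in zip(h, g)]
    beta = [hi - eta / nu * gi for hi, gi in zip(h, g)]
    return alpha, beta


# Illustrative (assumed) market data.
cov = [[0.0400, 0.0060, 0.0000],
       [0.0060, 0.0900, 0.0120],
       [0.0000, 0.0120, 0.0225]]
mu = [0.06, 0.10, 0.05]

h, g = merton_h_g(cov, mu)
eta, nu = 0.3, 2.0            # arbitrary constants, nu != 0
alpha, beta = two_funds(h, g, eta, nu)

m = 1.7                       # assumed value of m(V_t, t) for one investor
a = nu * m + eta              # scalar a(V_t, t)
w_direct = [hi + m * gi for hi, gi in zip(h, g)]
w_funds = [a * al + (1.0 - a) * be for al, be in zip(alpha, beta)]
```

Changing `m` moves the portfolio along the same straight line, while `alpha` and `beta` stay fixed, which is the content of the separation property.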


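When considering HARA utility functions, the optimal solution is an explicit
function of the utility function parameters. Assume, to simplify, that there
exist two securities, one riskless and one risky. Consider an intertemporal
utility defined as a function of the consumption $c$ of the following type:

\[ U(c, t) = \exp(-\rho t) \times \tilde{U}(c) \quad\text{with}\quad \tilde{U}(c) = \frac{1 - \gamma}{\gamma}\left(\frac{\beta c}{1 - \gamma} + \eta\right)^{\gamma}, \tag{6.60} \]

where $\rho > 0$, $\gamma \neq 1$, $\beta > 0$, $\frac{\beta c}{1 - \gamma} + \eta > 0$, and $\eta = 1$ if $\gamma = -\infty$.

The optimal solutions of Equation (6.52) are given by:

\[ \mathcal{J}(V, t) = \frac{\delta \beta^{\gamma}}{\gamma}\, e^{-\rho t} \left[\frac{\delta\left(1 - \exp\left[-\frac{(\rho - \gamma\nu)(T - t)}{\delta}\right]\right)}{\rho - \gamma\nu}\right]^{\delta} \times \left(\frac{V}{\delta} + \frac{\eta}{\beta r}\left[1 - \exp[-r(T - t)]\right]\right)^{\gamma}, \tag{6.61} \]

where $\delta = 1 - \gamma$ and $\nu = r + (\mu - r)^2 / (2\delta\sigma^2)$.
Note that, for $\gamma > 1$, the solution holds only for

\[ V_t \leq (\gamma - 1)\,\eta\,\left[1 - \exp[-r(T - t)]\right] / (\beta r). \]

The optimal consumption and weighting are respectively given by:

\[ c_t^*(V_t) = \frac{(\rho - \gamma\nu)\left(V_t + \frac{\delta\eta}{\beta r}\left[1 - \exp[-r(T - t)]\right]\right)}{\delta\left(1 - \exp\left[-\frac{(\rho - \gamma\nu)(T - t)}{\delta}\right]\right)} - \frac{\delta\eta}{\beta}, \tag{6.62} \]

and

\[ w_t^*(V_t)\, V_t = \frac{\mu - r}{\delta\sigma^2}\, V_t + \frac{\eta(\mu - r)}{\beta r \sigma^2}\left[1 - \exp[-r(T - t)]\right]. \tag{6.63} \]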


REMARK 6.8 The optimal consumption $c^*$ and amounts $w^* V$ are linear
functions of the wealth $V$. The value process $V$ also has coefficients which are
deterministic functions w.r.t. the current time. Note that, when the vector
of security logreturns is Gaussian, the HARA utility functions are the only
utility functions for which these linearity properties are valid.
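
As an illustration of this linearity, here is a minimal sketch of the HARA policies (6.62) and (6.63) for one riskless and one risky asset. The function name `hara_policy` and all parameter values are assumptions chosen for the example; the sketch also assumes $r \neq 0$ and $\rho \neq \gamma\nu$ so that no division by zero occurs.

```python
import math


def hara_policy(V, t, T, r, mu, sigma, rho, gamma, beta, eta):
    """Optimal consumption rate and risky amount for HARA utility, finite horizon."""
    delta = 1.0 - gamma
    nu = r + (mu - r) ** 2 / (2.0 * delta * sigma ** 2)
    kappa = (rho - gamma * nu) / delta
    horizon = 1.0 - math.exp(-r * (T - t))
    # Shifted wealth X = V + (delta * eta / (beta * r)) * (1 - exp(-r (T - t))).
    shifted = V + delta * eta / (beta * r) * horizon
    consumption = kappa * shifted / (1.0 - math.exp(-kappa * (T - t))) - delta * eta / beta
    risky_amount = ((mu - r) / (delta * sigma ** 2) * V
                    + eta * (mu - r) / (beta * r * sigma ** 2) * horizon)
    return consumption, risky_amount


# Illustrative (assumed) parameters: gamma = -1 gives delta = 2.
params = dict(t=0.0, T=10.0, r=0.03, mu=0.08, sigma=0.20,
              rho=0.05, gamma=-1.0, beta=1.0, eta=0.5)

c_low, a_low = hara_policy(100.0, **params)
c_mid, a_mid = hara_policy(150.0, **params)
c_high, a_high = hara_policy(200.0, **params)

# With eta = 0 the risky amount is proportional to wealth.
_, a_crra = hara_policy(100.0, **{**params, "eta": 0.0})
```

Both outputs are affine in `V`: the value at the midpoint of two wealth levels is the average of the two values, and with `eta = 0` the risky amount reduces to the constant fraction $(\mu - r)/(\delta\sigma^2)$ of wealth.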


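The final step is to search for the optimal wealth $V^*$. It is determined by
using previous solutions $c^*$ and $w^*$ of Equations (6.62) and (6.63):

\[ dV_t = \left(\left[w_t^*(\mu - r) + r\right] V_t - c_t^*\right) dt + \sigma\, w_t^*\, V_t\, dW_t. \]

Then, setting

\[ X_t = V_t + \frac{\delta\eta}{\beta r}\left[1 - \exp[-r(T - t)]\right], \]

we have a stochastic process $(X_t)_t$ which is solution of the following standard
SDE:

\[ \frac{dX_t}{X_t} = a\, dt + b\, dW_t, \]

where $a$ is a deterministic function of time and $b$ is constant. The solution $X$ is a
geometric Brownian motion with time-dependent drift. Then, its probability
distribution is lognormal. We deduce the optimal portfolio value $V^*$:

\[ V_t^* = X_t - \frac{\delta\eta}{\beta r}\left[1 - \exp[-r(T - t)]\right], \tag{6.64} \]

with

\[ X_t = X_0 \exp\left[\left(r - \frac{\rho - \gamma\nu}{\delta} + (1 - 2\gamma)\frac{(\mu - r)^2}{2\delta^2\sigma^2}\right) t + \frac{\mu - r}{\delta\sigma}\, W_t\right] \times \frac{1 - \exp\left[\frac{(\rho - \gamma\nu)(t - T)}{\delta}\right]}{1 - \exp\left[-\frac{(\rho - \gamma\nu)\, T}{\delta}\right]}. \tag{6.65} \]

Then, the optimal consumption and allocations are functions $c_t^*(V_t^*)$ and
$w_t^*(V_t^*)$.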


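Examine the following two particular cases:


• Case 1: (logarithmic utility) (which results from the limit case where
$\gamma = \eta = 0$, and $\beta = 1 - \gamma = 1$)

\[ c_t^*(V_t^*) = \frac{V_t^*}{T - t} \quad\text{and}\quad w_{i,t}^*\, V_t^* = \Big[\sum_{j=1}^{d} v_{ij}\,(\mu_j - r)\Big] V_t^*. \tag{6.66} \]

(The consumption rule is written for $\rho \to 0$; for $\rho > 0$, Equation (6.62) gives $c_t^* = \rho V_t^* / (1 - e^{-\rho(T - t)})$.)


• Case 2: (CARA utility) ($U(c, t) = -\exp(-\eta c)/\eta$)

\[ c_t^*(V_t^*) = r V_t^* + \frac{1}{\eta r}\left[\rho - r + \frac{(\mu - r)^2}{2\sigma^2}\right] \quad\text{and}\quad w_t^*\, V_t^* = \frac{\mu - r}{\eta r \sigma^2}. \tag{6.67} \]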


6.2.3 The general case 


Consider the general financial model with all assumptions (S), (P), and (U).
This model has been examined by Karatzas et al. [318], using the Bellman ap- 
proach developed in the previous section. However, since the financial market 
is complete, another approach, based on the representation theorem for mar- 
tingales, can be used to solve the optimization problem, as shown by Lehoczky 
et al. [350]. 


The investor searches for the solution of

\[ \max_{c,\,w}\; E\left[\int_0^T U(c_s, s)\, ds + \hat{U}(V_T, T)\right], \tag{6.68} \]

under a positivity constraint on the wealth process $V$.

6.2.3.1 The positivity constraint


Indeed, the "infinite-dimensional" constraint $V_t \geq 0$ is not easy to check.
Thus, we can try to "reduce" it to one dimension. For this purpose, introduce
the risk-neutral probability $Q$ with Radon-Nikodym derivative $L$ defined by:

\[ L_t = E\left[\frac{dQ}{dP}\,\Big|\, \mathcal{F}_t\right] = \exp\left[\int_0^t \eta(s)\, dW_s - \frac{1}{2}\int_0^t \|\eta(s)\|^2\, ds\right], \tag{6.69} \]

where ($\mathbb{1}$ denotes the vector with all components equal to 1):

\[ \eta(t) = -[\sigma(t)]^{-1}\left(\mu(t) - r(t)\,\mathbb{1}\right). \tag{6.70} \]

Under assumptions (S), the process $\eta$ is bounded, so the Girsanov theorem
can be used.

The process $\widetilde{W}$, defined by $\widetilde{W}_t = W_t - \int_0^t \eta(s)\, ds$, is a Brownian motion
w.r.t. the risk-neutral probability $Q$ and the filtration $\mathcal{F}_t$. Under $Q$, the asset
prices are solution of:

\[ dS_t = S_t\left(r_t\, dt + \sigma(t, S_t)\, d\widetilde{W}_t\right), \tag{6.71} \]

and the wealth process, associated to strategy $(c, w)$, is given by:

\[ dV_t^{c,w} = \left[r_t V_t^{c,w} - c(t)\right] dt + \sum_{i,j} w_i(t)\, \sigma_{ij}(t, S_t)\, d\widetilde{W}_{j,t}, \tag{6.72} \]

\[ V_0^{c,w} = v_0. \tag{6.73} \]

Denote by $R$ the discount factor associated with the money market account:

\[ R_t = \exp\left[-\int_0^t r(s)\, ds\right]. \tag{6.74} \]

Then

\[ V_t^{c,w} R_t = v_0 - \int_0^t R(s)\, c(s)\, ds + \int_0^t R(s)\, w(s)^{\top} \sigma(s)\, d\widetilde{W}_s. \]


DEFINITION 6.3 A strategy $(c, w)$ is said to be "admissible" for a given
initial wealth $v_0$, if the process $V^{c,w}$ is positive a.s. Denote by $\mathcal{A}(v_0)$ the set of
such strategies.


PROPOSITION 6.6
Let $(c, w)$ be a strategy in $\mathcal{A}(v_0)$ and let $V^{c,w}$ represent the associated wealth process.
Then

\[ E_Q\left[V_T^{c,w} R_T + \int_0^T R(s)\, c(s)\, ds\right] \leq v_0. \tag{6.75} \]

PROOF If $(c, w)$ is admissible, the process $M_t = v_0 + \int_0^t R(s)\, w(s)^{\top} \sigma(s)\, d\widetilde{W}_s$
is a local $Q$-martingale which is equal to $V_t^{c,w} R_t + \int_0^t R(s)\, c(s)\, ds$, and so is
positive. Thus, $M$ is a $Q$-supermartingale, which implies $E_Q[M_T] \leq M_0$.














A converse property can also be proved, using the martingale representa- 
tion. 


PROPOSITION 6.7
Let $c$ be a consumption process satisfying assumption (S2), and let $X$ be a
positive $\mathcal{F}_T$-measurable random variable ("contingent claim") such that:

\[ E_Q\left[X R_T + \int_0^T R(s)\, c(s)\, ds\right] = v_0. \]

Then, there exists a portfolio weighting $w$ which is predictable such that the
pair $(c, w)$ is admissible, and the terminal wealth $V_T^{c,w}$ is equal to $X$.


PROOF
- First note that, if such weighting $w$ exists, the local $Q$-martingale

\[ M_t = v_0 + \int_0^t R(s)\, w(s)^{\top} \sigma(s)\, d\widetilde{W}_s, \tag{6.76} \]

is positive since it can be written as:

\[ M_t = V_t^{c,w} R_t + \int_0^t R(s)\, c(s)\, ds. \]

The supermartingale $M$ is also a martingale (since $E_Q[M_T] = v_0 = M_0$) such that:

\[ M_t = E_Q[M_T \mid \mathcal{F}_t] = E_Q\left[V_T^{c,w} R_T + \int_0^T R(s)\, c(s)\, ds \,\Big|\, \mathcal{F}_t\right]. \]

- Therefore, it is sufficient to prove that there exists $w$ satisfying relation
(6.76). For a given pair $(c, X)$, the process

\[ M_t = E_Q\left[X R_T + \int_0^T R(s)\, c(s)\, ds \,\Big|\, \mathcal{F}_t\right] \tag{6.77} \]

is a $Q$-martingale.


The predictable representation theorem for martingales proves that there
exists a predictable process $\theta$ such that $\int_0^T \|\theta_t\|^2\, dt < \infty$ a.s., and

\[ M_t = M_0 + \int_0^t \theta_s\, d\widetilde{W}_s, \qquad M_0 = E_Q[M_T] = v_0. \tag{6.78} \]

Then, consider the process $w$ given by: for any $t \in [0,T]$,

\[ w(t) = \frac{1}{R(t)}\left[\sigma(t)^{\top}\right]^{-1} \theta(t), \]

and the process $V^{c,w}$ defined by:

\[ V_t^{c,w} R_t = v_0 - \int_0^t R(s)\, c(s)\, ds + \int_0^t \theta(s)\, d\widetilde{W}_s. \]

We can easily check that the process $V^{c,w}$ is associated to the strategy
$(c, w)$, which is admissible.


6.2.3.2 Existence of an optimal strategy 


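The optimization problem is:

\[ \max_{c,\,w}\; E_P\left[\int_0^T U(c_s, s)\, ds + \hat{U}(V_T, T)\right], \tag{6.79} \]

with the budget constraint:

\[ E_P\left[V_T\, R(T)\, L(T) + \int_0^T c_s\, R(s)\, L(s)\, ds\right] \leq v_0. \]

Consider the value-function $\mathcal{J}$:

\[ \mathcal{J}(t, V_t) = \sup_{c,\,w}\; E_P\left[\int_t^T U(c_s, s)\, ds + \hat{U}(V_T^{c,w}, T) \,\Big|\, \mathcal{F}_t\right]. \tag{6.80} \]

We first have to search for the optimal solution $(c^*, V_T^*)$, then to determine $w^*$.
A useful duality result can be used.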


PROPOSITION 6.8 Legendre transform
Under the assumptions (U), we have:

\[ U(J(y)) - y\, J(y) = \max_{c}\left[U(c) - c\, y\right], \]

where $J$ denotes the inverse function of $U'$. Note that

\[ \tilde{U}(y) = \max_{c}\left[U(c) - c\, y\right] \]

is the convex conjugate function of the investor's utility.

PROOF Since $U$ is concave, we have:

\[ U(J(y)) - U(c) \geq U'(J(y))\,(J(y) - c). \]

If $J(y) > 0$, $U'(J(y)) = y$. Therefore:

\[ U(J(y)) - U(c) \geq y\,(J(y) - c). \]

If $J(y) = 0$, $y \geq U'(0)$. Therefore:

\[ U(0) - U(c) \geq -U'(0)\, c \geq -y\, c. \]
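
The duality relation can be illustrated numerically. The sketch below assumes a power utility $U(c) = c^{\gamma}/\gamma$ with $\gamma = 0.5$ (an illustrative choice, not imposed by the proposition), for which $J(y) = y^{1/(\gamma - 1)}$, and compares the closed-form value $U(J(y)) - y J(y)$ with a brute-force maximization of $U(c) - c y$ over a grid of consumption levels. The grid bounds are assumptions chosen so that the maximizer lies inside the grid.

```python
GAMMA = 0.5  # assumed exponent of the power utility, 0 < GAMMA < 1


def utility(c):
    """Power utility U(c) = c**GAMMA / GAMMA."""
    return c ** GAMMA / GAMMA


def inverse_marginal(y):
    """J(y) = (U')^{-1}(y) for U'(c) = c**(GAMMA - 1)."""
    return y ** (1.0 / (GAMMA - 1.0))


def conjugate_closed_form(y):
    """Right-hand side of the Legendre relation: U(J(y)) - y J(y)."""
    j = inverse_marginal(y)
    return utility(j) - y * j


def conjugate_grid(y, c_max=50.0, n=100_000):
    """Brute-force maximum of U(c) - c y over a uniform grid on (0, c_max]."""
    return max(utility(c_max * k / n) - y * c_max * k / n for k in range(1, n + 1))


y = 0.8
exact = conjugate_closed_form(y)   # J(0.8) = 1.5625, so exact = 2.5 - 1.25
approx = conjugate_grid(y)
```

The grid value can never exceed the closed-form one, and it approaches it as the grid is refined, which is what the inequality in the proof states.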


To determine an optimal solution, we can use the Lagrange multipliers. For
$\lambda \in \mathbb{R}^+$, denote:

\[ L(c, V_T, \lambda) = E_P\left[\int_0^T U(c_s, s)\, ds + \hat{U}(V_T, T)\right] + \lambda\left(v_0 - E_P\left[V_T\, R(T)\, L(T) + \int_0^T R(s)\, L(s)\, c(s)\, ds\right]\right). \]

A sufficient condition for $(c^*, V_T^*)$ to be optimal is that there exists a Lagrange
multiplier $\lambda^* \in \mathbb{R}^+$ such that $(c^*, V_T^*, \lambda^*)$ satisfies: for any $(c, V_T)$
satisfying budget equation (6.75) and any $\lambda \in \mathbb{R}^+$,

\[ L(c, V_T, \lambda^*) \leq L(c^*, V_T^*, \lambda^*) \leq L(c^*, V_T^*, \lambda). \]

The first inequality, which is due to the optimality of $(c^*, V_T^*)$, can be solved
by searching for the values $(c_t^*, V_T^*)$ which satisfy: for any $(t, \omega)$,


- The consumption $c_t^*$ maximizes $U(c_t(\omega), t) - \lambda^* R(t)\, c_t(\omega)\, L(t)$; and,
- The wealth value $V_T^*$ maximizes $\hat{U}(V_T(\omega), T) - \lambda^* R(T)\, V_T(\omega)\, L(T)$.


We deduce:

\[ c_t^* = J(\lambda^* \kappa(t)) \quad\text{and}\quad V_T^* = \hat{J}(\lambda^* \kappa(T)), \]

where $\kappa(t) = R(t) L(t)$.


The Lagrange parameter $\lambda^*$ is determined from the budget equation. Its
existence is deduced by assuming:

- (V): For any $\lambda > 0$, $E\left[\int_0^T L(t)\, R(t)\, J(\lambda\kappa(t))\, dt\right]$ and $E\left[L(T)\, R(T)\, \hat{J}(\lambda\kappa(T))\right]$
are finite.

Under assumptions (U) and (V), the function $F$, defined by

\[ F(y) = E_P\left[\int_0^T \kappa(t)\, J(y\kappa(t), t)\, dt + \kappa(T)\, \hat{J}(y\kappa(T), T)\right], \]

is continuous and strictly decreasing on $(0, a)$ where $a = \inf\{y \mid F(y) = 0\}$. It
also satisfies:

\[ \lim_{y \to 0} F(y) = +\infty; \qquad \lim_{y \to +\infty} F(y) = 0. \]

Then, the function $F$ has an inverse $F^{-1}$. Define the function $G$ by:

\[ G(y) = H(F^{-1}(y)), \]

where the function $H$ is given by:

\[ H(y) = E_P\left[\int_0^T U(J(y\kappa(t)), t)\, dt + \hat{U}\big(\hat{J}(y\kappa(T)), T\big)\right]. \]

Assume also:


- (W): the utility functions $U$ and $\hat{U}$ are twice-continuously differentiable
and $U''$ and $\hat{U}''$ are increasing.


Then (see Karatzas et al. [318]): $F$ and $G$ are continuously differentiable
and $H'(y) = y\, F'(y)$.


To summarize: 


PROPOSITION 6.9
Under assumptions (U), (V), and (W) for utility functions $U$ and $\hat{U}$, there
exists an optimal strategy $(c^*, w^*) \in \mathcal{A}(v_0)$ such that:

\[ \mathcal{J}(c^*, w^*, v_0) = \max_{(c,w) \in \mathcal{A}(v_0)} \mathcal{J}(c, w, v_0), \tag{6.81} \]

\[ \text{where}\quad \mathcal{J}(c, w, v_0) = E_P\left[\int_0^T U(c_s, s)\, ds + \hat{U}(V_T^{c,w}, T)\right]. \tag{6.82} \]

Using previous function $F$ and the inverse functions $J$ and $\hat{J}$ of marginal
utility functions $U'$ and $\hat{U}'$, we have:

\[ c_t^* = J(F^{-1}(v_0)\, \kappa(t)), \tag{6.83} \]

\[ V_T^* = \hat{J}(F^{-1}(v_0)\, \kappa(T)), \]

and the optimal weighting $w^*$ is deduced from the martingale representation
(6.78).

Example 6.5


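Assume that the coefficients $r$, $\mu$, and $\sigma$ are constant, that the utility
function $U$ is a power function, $U(x) = x^{\gamma}$, and that $\hat{U}(x) = 0$. We have:


1) The inverse function $J$ is given by: $J(y) = \left(\frac{y}{\gamma}\right)^{1/(\gamma - 1)}$.


2) The function $F^{-1}$ is such that:

\[ F^{-1}(v) = \gamma\,(bv)^{\gamma - 1} \quad\text{with}\quad b = 1 \Big/ E_P\left[\int_0^T \kappa(t)^{\gamma/(\gamma - 1)}\, dt\right]. \]

3) The optimal consumption is given by:

\[ c_t^* = J(F^{-1}(v_0)\, \kappa(t)) = v_0\, b\, \kappa(t)^{1/(\gamma - 1)}. \]

4) The optimal terminal wealth is null, since the utility function $\hat{U}$ is itself null.


5) The optimal weighting is determined from the predictable representation
of the martingale

\[ M_t = E_Q\left[\int_0^T R(s)\, c^*(s)\, ds \,\Big|\, \mathcal{F}_t\right] = v_0 + \int_0^t \theta^*(s)\, d\widetilde{W}_s, \]

which, by identification, leads to the equality:

\[ w^*(t) = [R(t)]^{-1}\, \theta^*(t)\, [\sigma(t)]^{-1} = e^{rt}\, \theta^*(t) / \sigma. \]

To determine $\theta^*(t)$, we note that:

\[ \kappa(t) = e^{-rt} \exp\left[\eta W_t - \eta^2 t / 2\right]. \]

Then, setting $\delta = 1/(\gamma - 1)$, consider the exponential martingale:

\[ L_t = \exp\left[\eta\delta\, \widetilde{W}_t - \eta^2\delta^2 t / 2\right]. \]

Set

\[ \xi = r(1 + \delta) - \frac{1}{2}\eta^2\delta(1 + \delta). \]

Then,

\[ R_t\, c^*(t) = v_0\, b\, R_t\, \kappa(t)^{\delta} = v_0\, b\, L_t\, e^{-\xi t}. \]

Thus, the martingale $M_t$ can be expressed by using $L$:

\[ M_t = v_0\, b \int_0^t L_s\, e^{-\xi s}\, ds + v_0\, b \int_t^T E_Q[L_s \mid \mathcal{F}_t]\, e^{-\xi s}\, ds. \]

Since $L$ is an $\mathcal{F}_t$-martingale w.r.t. $Q$, we have:

\[ M_t = v_0\, b \int_0^t L_s\, e^{-\xi s}\, ds + v_0\, b\, L_t\, \frac{e^{-\xi t} - e^{-\xi T}}{\xi}. \]

Using $dL_t = \eta\delta\, L_t\, d\widetilde{W}_t$, we deduce:

\[ dM_t = \frac{v_0\, b\, \eta\, \delta}{\xi}\, L_t \left[e^{-\xi t} - e^{-\xi T}\right] d\widetilde{W}_t. \]

Finally,

\[ w^*(t) = \frac{\mu - r}{\sigma^2(\gamma - 1)\,\xi}\left[e^{-\xi(T - t)} - 1\right] c_t^*, \]

and

\[ H(y) = \left(\frac{y}{\gamma}\right)^{\gamma/(\gamma - 1)} \frac{1 - e^{-\xi T}}{\xi}, \qquad G(v_0) = (v_0)^{\gamma}\left(\frac{1 - e^{-\xi T}}{\xi}\right)^{1 - \gamma}. \]

Note that if $U(x) = \hat{U}(x) = x^{\gamma}$, we have:

\[ c_t^* = \left(\frac{\lambda\,\kappa(t)}{\gamma}\right)^{1/(\gamma - 1)} \quad\text{and}\quad V_T^* = \left(\frac{\lambda\,\kappa(T)}{\gamma}\right)^{1/(\gamma - 1)}, \]

with

\[ \lambda = \gamma\left(v_0 \Big/ E_P\left[\int_0^T \kappa(t)^{\gamma/(\gamma - 1)}\, dt + \kappa(T)^{\gamma/(\gamma - 1)}\right]\right)^{\gamma - 1}. \]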

REMARK 6.9 The previous method relies heavily on martingale methods.
In Chapter 7, this approach is used to determine optimal portfolio design,
in particular by using the results of Cox and Huang [132].

6.2.4 Recursive utility in continuous-time


Epstein and Zin [212] introduce the notion of recursive preferences which 
generalizes the standard time-separable power utility. This new preference 
modelling allows for the separation of the relative risk aversion from the elas- 
ticity of intertemporal substitution of consumption. Duffie and Epstein [173] 
(see also Svensson [483]) introduce recursive utility in continuous-time. 


Consider the following parametrization of recursive utility:

\[ U_{rec,t} = E_t\left[\int_t^{\infty} f(C_s, U_{rec,s})\, ds\right], \tag{6.85} \]

where $f(.,.)$ is a function which allows for the aggregation of the current
consumption $C_s$ and the continuation utility $U_{rec,s}$.
Duffie and Epstein [173] assume that this function is given by:

\[ f(C, U_{rec}) = \frac{\beta}{1 - \frac{1}{\psi}}\,(1 - \gamma)\, U_{rec}\left[\left(\frac{C}{\left((1 - \gamma)\, U_{rec}\right)^{\frac{1}{1 - \gamma}}}\right)^{1 - \frac{1}{\psi}} - 1\right], \]

where $\beta > 0$ denotes the rate of time preference, $\gamma > 0$ denotes the coefficient
of relative risk aversion, and $\psi > 0$ denotes the elasticity of intertemporal
substitution.

Note that for $\gamma = 1/\psi$ we recover the power utility. Duffie and Epstein [173]
prove that:


• First, the Bellman principle can still be applied.


• Second, it is sufficient to substitute the term $f(C, U_{rec})$ for the instantaneous
utility function $U(C)$ in Equation (6.43). Then, we have to use
a new functional $\Phi_{rec}$ defined by:

\[ \Phi_{rec}(C, w, V, S, t) = f(C, U_{rec}) + D(\mathcal{J}). \tag{6.87} \]
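
The statement that $\gamma = 1/\psi$ recovers the power utility can be made concrete. The sketch below evaluates the Duffie-Epstein aggregator written as $f(C, J) = \frac{\beta}{1 - 1/\psi}(1 - \gamma) J \big[(C / ((1 - \gamma) J)^{1/(1 - \gamma)})^{1 - 1/\psi} - 1\big]$ and compares it, when $\gamma = 1/\psi$, with the additive form $\beta\,[C^{1 - \gamma}/(1 - \gamma) - J]$. The function names and numerical values are assumptions for illustration; the code requires $\psi \neq 1$, $\gamma \neq 1$ and $(1 - \gamma) J > 0$.

```python
def duffie_epstein_aggregator(c, j, beta, gamma, psi):
    """Normalized aggregator f(C, J); assumes psi != 1, gamma != 1, (1 - gamma) * j > 0."""
    exponent = 1.0 - 1.0 / psi
    scale = ((1.0 - gamma) * j) ** (1.0 / (1.0 - gamma))
    return beta / exponent * (1.0 - gamma) * j * ((c / scale) ** exponent - 1.0)


def additive_power_aggregator(c, j, beta, gamma):
    """Time-separable power utility case: beta * (C**(1 - gamma) / (1 - gamma) - J)."""
    return beta * (c ** (1.0 - gamma) / (1.0 - gamma) - j)


# Illustrative (assumed) values; j is negative because 1 - gamma < 0.
beta, gamma = 0.04, 2.0
c, j = 1.3, -0.8

f_recursive = duffie_epstein_aggregator(c, j, beta, gamma, psi=1.0 / gamma)
f_additive = additive_power_aggregator(c, j, beta, gamma)

# With psi different from 1 / gamma the two aggregators no longer coincide.
f_other = duffie_epstein_aggregator(c, j, beta, gamma, psi=1.5)
```

Keeping $\gamma$ fixed and varying `psi` changes the aggregator without changing risk aversion, which is the separation that recursive preferences are designed to allow.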


Example 6.6

In the recursive utility framework, Campbell and Viceira [102] propose to
examine the impact of a mean-reverting interest rate. They assume that the
investor can only choose between cash and a long-term real bond. They also
assume that the instantaneous riskless interest rate $r_t$ follows an
Ornstein-Uhlenbeck process given by:

\[ dr_t = a_r(b_r - r_t)\, dt - \sigma_r\, dW_{r,t}, \tag{6.88} \]

where $a_r$, $b_r$, and $\sigma_r$ are positive constants, and $W_r$ is a standard Brownian
motion. The market price of interest rate risk is assumed to be constant (see
Vasicek [498]).

(1) Cash dynamics:

\[ \frac{dC_t}{C_t} = r_t\, dt. \tag{6.89} \]

(2) The zero-coupon bond $B$ with maturity $T$ is solution of the following
SDE:

\[ \frac{dB(r_t, T - t)}{B(r_t, T - t)} = \left(r_t + \theta(T - t)\right) dt + \sigma_{T-t}\, dW_{r,t}, \tag{6.90} \]

where $\sigma_{T-t}$ denotes the volatility at time $t$ of the zero-coupon bond:

\[ \sigma_{T-t} = \sigma_r\, \frac{1 - e^{-a_r(T - t)}}{a_r}, \tag{6.91} \]

which is decreasing with time.
Denote by $\lambda_r$ the risk premium on the bond. Set

\[ \theta(T - t) = \lambda_r\, \sigma_{T-t}. \]

Using the Bellman equation associated to the functional $\Phi_{rec}$ in Equation (6.87)
with $\psi = 1$, Campbell and Viceira guess that the solution has the following
form:

\[ U_{rec}(V_t, r_t) = I(r_t)\, \frac{V_t^{1 - \gamma}}{1 - \gamma}. \tag{6.92} \]

This leads to the following ordinary differential equation (ODE):

\[
\begin{aligned}
0 ={}& -\frac{\beta}{1 - \gamma}\log(I) + \left(\beta\log\beta - \beta + r_t\right) + \frac{\lambda_r^2}{2\gamma} + \frac{\sigma_r^2}{2\gamma}\left(\frac{1}{I}\frac{\partial I}{\partial r}\right)^2 \\
&+ \left(a_r(b_r - r_t) - \frac{1 - \gamma}{\gamma}\,\lambda_r\,\sigma_r\right)\frac{1}{1 - \gamma}\,\frac{1}{I}\frac{\partial I}{\partial r} + \frac{\sigma_r^2}{2(1 - \gamma)}\,\frac{1}{I}\frac{\partial^2 I}{\partial r^2}.
\end{aligned}
\tag{6.95}
\]

This equation has an exact solution. There exist two constants $A_0$ and $A_1$
such that:

\[ I(r_t) = \exp[A_0 + A_1 r_t]. \]

The consumption-wealth ratio is constant and equal to $\beta$, and the fraction of
wealth $\alpha_t$ invested in the bond is given by:

\[ \alpha_t = \frac{1}{\gamma}\,\frac{\lambda_r}{\sigma_{T-t}} + \left(1 - \frac{1}{\gamma}\right)\frac{\sigma_r}{\sigma_{T-t}\,(a_r + \beta)}. \]
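
A minimal sketch of this allocation rule follows, assuming hypothetical parameter values. It splits the bond weight into a myopic term $\lambda_r/(\gamma\,\sigma_{T-t})$ and an interest-rate hedging term $(1 - 1/\gamma)\,\sigma_r / (\sigma_{T-t}(a_r + \beta))$, written under the sign conventions of (6.88) and (6.90) as reconstructed above; the function names are illustrative.

```python
import math


def bond_volatility(tau, a_r, sigma_r):
    """Zero-coupon bond volatility (6.91) for time to maturity tau = T - t."""
    return sigma_r * (1.0 - math.exp(-a_r * tau)) / a_r


def bond_weight(tau, gamma, beta, a_r, sigma_r, lambda_r):
    """Fraction of wealth in the bond: myopic demand plus hedging demand."""
    sig_b = bond_volatility(tau, a_r, sigma_r)
    myopic = lambda_r / (gamma * sig_b)
    hedging = (1.0 - 1.0 / gamma) * sigma_r / (sig_b * (a_r + beta))
    return myopic + hedging


# Illustrative (assumed) parameters.
a_r, sigma_r, lambda_r, beta, tau = 0.20, 0.015, 0.10, 0.04, 10.0
sig_b = bond_volatility(tau, a_r, sigma_r)

w_log = bond_weight(tau, 1.0, beta, a_r, sigma_r, lambda_r)        # gamma = 1
w_moderate = bond_weight(tau, 5.0, beta, a_r, sigma_r, lambda_r)
w_extreme = bond_weight(tau, 1e9, beta, a_r, sigma_r, lambda_r)    # very risk averse
hedging_limit = sigma_r / (sig_b * (a_r + beta))
```

For $\gamma = 1$ the hedging term vanishes and only the myopic demand remains. As $\gamma$ grows, the myopic demand disappears and the weight tends to the pure hedging position `hedging_limit`.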
6.3 Further reading


Optimization theory and its applications based on ordinary differential equations
are studied in Cesari [110]. Multiperiod optimization based on the discrete-time
Bellman principle is examined in Bertsekas ([63], [64]). Portfolio optimization in
the binomial model is solved in Bajeux [37].


Since this chapter provides only a brief overview of stochastic control theory,
the reader is referred to El Karoui [188] for a general and sound stochastic control
analysis. Results about controlled diffusion processes can be found in Krylov
[339], Davis [150], and Dempster [158].


Optimization dealing with Markov models is detailed in [150]. Hamilton-Jacobi
equations can be studied by using the notion of viscosity solutions, as shown
for example by Shreve and Soner [470], and Zariphopoulou [510].


Dynamic allocation problems are also examined in El Karoui and Karatzas 
[189]. Recursive portfolio analysis is provided in El Karoui, Peng and Quenez 
[195], using backward stochastic differential equations. 


Portfolio optimization in a lognormal market is examined for a power utility 
in Dexter et al. [166]. 


Portfolio optimization when assets are discontinuous is examined in Jean- 
blanc and Pontier [298], and in Shirakawa [468]. 


Numerical methods can be found in Kushner and Dupuis [341], Fitzpatrick 
and Fleming [229], and Rogers and Talay [429]. 


Monte Carlo methods for portfolio analysis are used in Cvitanic et al. [136], 
and in Detemple et al. [164]. 


Chapter 7 


Optimal payoff profiles and 
long-term management 


This chapter deals with two main applications of some of the results obtained 
in Chapter 6: 


• First, the optimal portfolio value at maturity is assumed to be a function
of a given set of asset prices. Therefore, the problem consists in
determining an optimal function or "payoff profile."


• Second, the determination of an optimal long-term portfolio, invested
in cash, bonds, and stocks.


7.1 Optimal payoffs as functions of a benchmark 
7.1.1 Linear versus option-based strategy 


Assume that the investor maximizes his expected utility, but uses only a
buy-and-hold strategy $w = (w_{B,0}, w_{S,0})$ to invest in a riskless asset $B$ with
rate $r$, and a risky asset $S$. Then, any solution of the optimization problem:

\[ \max_{w}\; E_P[U(V_T)] \quad\text{with}\quad V_0 = e^{-rT} E_Q[V_T], \]

is necessarily linear w.r.t. the risky asset:

\[ V_T = V_0\left(w_{B,0}\,\frac{B_T}{B_0} + w_{S,0}\,\frac{S_T}{S_0}\right). \]

However, he may search for nonlinear solutions, including, for example,
derivatives in his portfolio.
To determine his optimal choice, $V_T^*$, he can search for this solution assuming
that $V_T = h(S_T)$. Then, the portfolio optimization is now given by:

\[ \max_{h}\; E_P[U(h(S_T))] \quad\text{with}\quad V_0 = e^{-rT} E_Q[h(S_T)], \]

where $P$ denotes the historical probability, and $Q$ is the risk-neutral probability
which is used to price options. According to Brennan and Solanki [88] (see
also Carr and Madan [106]), we deduce:

\[ V_T^* = J(\lambda g), \tag{7.1} \]

where $g$ is the density $dQ/dP$ of $Q$ w.r.t. $P$, $\lambda$ is the Lagrange parameter
corresponding to the budget constraint, and $J(x) = (U')^{-1}(x)$.


7.1.1.1 The general model 


Suppose, as in [415], that three basic financial assets are available:
• The cash associated to a discount factor $N$.
• The bond $B$.
• The stock or a financial index $S$.

The investor is supposed to determine an optimal payoff $h$ which is a function
defined on all possible values of the assets $(N, B, S)$ at maturity.


REMARK 7.1 If the market is complete, this payoff can be achieved
by the investor. The market can be complete, for example, as shown in the
previous chapter if:


• The financial market evolves in continuous time and all options can be
dynamically replicated by a perfect hedging strategy;


• Or, if, in a one-period setting, European options of all strikes are available
on the financial market. In this setting, the inability to continuously
trade potentially induces investment in cash, asset $B$, asset $S$ and all
European options with underlying assets $B$ and $S$ (if cash and bond are
non-stochastic, only European options on $S$ are required).


The market can also be incomplete. In that case, the solution given in this
section is only "theoretical," but still interesting to know since the optimal
payoff can be approximated by investing in traded assets. (In practice, the
investor defines an approximation method, which may take transaction costs
or liquidity problems into account.)


The investor is assumed to be a price taker. For example, if his benchmark
$S$ is the S&P 500, then his investment is too small to modify the index value.


Under the standard condition of no-arbitrage, the asset prices are calculated
under risk-neutral probabilities. If markets exist for out-of-the-money
European puts and calls of all strikes, then it implies the existence of a unique
risk-neutral probability that may be identified from option prices. Otherwise,
if there is no continuous trading, generally the market is incomplete and one
particular risk-neutral probability $Q$ is used to price the options. It is also
possible that stock prices change continuously, but the market may be still
dynamically incomplete. Again, it is assumed that one risk-neutral probability
is selected. Assume that prices are determined under measure $Q$. Denote
by $\frac{dQ}{dP}$ the Radon-Nikodym derivative of $Q$ with respect to the historical
probability $P$. Denote by $N_T$ the discount factor and by $M_T$ the product $N_T \frac{dQ}{dP}$.


Due to the no-arbitrage condition, the budget constraint corresponds to the
following relation:

\[ V_0 = E_Q[h(N_T, B_T, S_T)\, N_T] = E_P[h(N_T, B_T, S_T)\, M_T]. \]

The investor has to solve the following optimization problem:

\[ \max_{h}\; E_P[U(h(N_T, B_T, S_T))] \quad\text{under}\quad V_0 = E_P[h(N_T, B_T, S_T)\, M_T]. \tag{7.2} \]














To simplify the presentation of the main results, we suppose that the function
$h$ fulfils:

\[ \int_{\mathbb{R}_+^3} h^2(n, b, s)\, P_{(N_T, B_T, S_T)}(dn, db, ds) < \infty. \]

This means that $h \in L^2(\mathbb{R}_+^3, P_{X_T}(dx))$, where $X_T = (N_T, B_T, S_T)$, which is
the set of the measurable functions with squares that are integrable on $\mathbb{R}_+^3$
with respect to the distribution $P_{X_T}(dx)$.


The utility function $U$ is associated with a new functional $\Phi_U$, which is
defined on the space $L^2(\mathbb{R}_+^3, P_{X_T}(dx))$ by:

\[ \text{For any } Y \in L^2(\mathbb{R}_+^3, P_{X_T}(dx)), \quad \Phi_U(Y) = E_{P_{X_T}}[U(Y)]. \]

$\Phi_U$ is usually called the Nemitski functional associated with $U$ (see Ekeland
and Turnbull [187] for definition and basic properties).


PROPOSITION 7.1

Introduce the conditional expectation of $M_T$ under the $\sigma$-algebra generated by
$(N_T, B_T, S_T)$, and denote it by $g$. Assume that $g$ is a function defined on
the set of the values of $X_T = (N_T, B_T, S_T)$, and $g \in L^2(\mathbb{R}_+^3, P_{X_T})$. Then,
the optimization problem is reduced to:

\[ \max_{h \in L^2(\mathbb{R}_+^3, P_{X_T})} \int_{\mathbb{R}_+^3} U(h(x))\, P_{X_T}(dx) \tag{7.3} \]

under: $V_0 = \int_{\mathbb{R}_+^3} h(x)\, g(x)\, P_{X_T}(dx)$.


We deduce that the optimal payoff $h^*$ is given by:

\[ h^* = J(\lambda g), \tag{7.4} \]

where $\lambda$ is the scalar Lagrange multiplier such that

\[ V_0 = \int_{\mathbb{R}_+^3} J(\lambda g(x))\, g(x)\, P_{X_T}(dx). \]

PROOF From the properties of the utility function $U$, the Nemitski functional
$\Phi_U$ is concave and differentiable (the Gateaux derivative exists) on
$L^2(\mathbb{R}_+^3, P_{X_T})$. Additionally, the budget constraint is a linear function of $h$.
So there exists exactly one solution $h^*$.
$h^*$ is the solution of $\frac{\partial L}{\partial h} = 0$ where the Lagrangian $L$ is defined by:

\[ L(h, \lambda) = \int_{\mathbb{R}_+^3} U[h(x)]\, P_{X_T}(dx) + \lambda\left(V_0 - \int_{\mathbb{R}_+^3} h(x)\, g(x)\, P_{X_T}(dx)\right), \]

where $\lambda$ is the Lagrange multiplier associated to the budget constraint.
So, $h^*$ satisfies: $U'(h^*) = \lambda g$. Therefore, $h^* = J(\lambda g)$.
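
The characterization $h^* = J(\lambda g)$ can be illustrated on a finite state space. The sketch below assumes a hypothetical three-state market with probabilities `probs` and pricing kernel values `g` (both invented for the example), a CRRA utility $U(x) = x^{\alpha}/\alpha$ so that $J(y) = y^{1/(\alpha - 1)}$, and finds the multiplier $\lambda$ by bisection on the budget constraint $V_0 = \sum_x p(x)\, g(x)\, J(\lambda g(x))$.

```python
def optimal_payoff(probs, g, v0, alpha):
    """Return (lam, h) with h[x] = J(lam * g[x]) and the budget constraint satisfied."""
    expo = 1.0 / (alpha - 1.0)          # J(y) = y**expo for U(x) = x**alpha / alpha

    def budget(lam):
        # Cost today of the payoff J(lam * g): sum of p * g * h over states.
        return sum(p * gi * (lam * gi) ** expo for p, gi in zip(probs, g))

    lo, hi = 1e-12, 1e12                # budget(lam) is decreasing in lam
    for _ in range(200):
        mid = (lo * hi) ** 0.5          # bisection on a logarithmic scale
        if budget(mid) > v0:
            lo = mid
        else:
            hi = mid
    lam = (lo * hi) ** 0.5
    return lam, [(lam * gi) ** expo for gi in g]


# Illustrative (assumed) three-state example: g is highest in the "bad" state.
probs = [0.2, 0.5, 0.3]
g = [1.30, 0.95, 0.70]
v0, alpha = 100.0, 0.5

lam, h = optimal_payoff(probs, g, v0, alpha)
cost = sum(p * gi * hi for p, gi, hi in zip(probs, g, h))
```

The payoff is largest in the states where `g` is smallest, that is, where terminal wealth is cheapest to buy, in line with the fact that $J$ is decreasing.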


Suppose for example that cash and bonds are not stochastic. Then, the properties of the optimal payoff $h^*$ as a function of the benchmark $S$ can be analyzed. Since the utility function $U$ is concave, the marginal utility $U'$ is decreasing, so $J$ is also decreasing, from which we deduce:

COROLLARY 7.1

$h^*$ is an increasing function of the benchmark $S_T$ iff the conditional expectation $g$ of $\frac{dQ}{dP}$ under the σ-algebra generated by $S_T$ is a decreasing function of $S_T$. More precisely: assume that $g$ is differentiable. From the optimality condition $U'(h^*(s)) = \lambda g(s)$, the derivative of the optimal payoff is given by:

$$h^{*\prime}(s) = \frac{U'(h^*(s))}{U''(h^*(s))} \times \frac{g'(s)}{g(s)}.$$

Note that, in most cases, $g$ is decreasing.

Introduce the tolerance of risk $T_o(h^*(s))$, equal to the inverse of the absolute risk aversion:

$$T_o(h^*(s)) = -\frac{U'(h^*(s))}{U''(h^*(s))}.$$

As can be seen, $h^{*\prime}(s)$ depends on the tolerance of risk. The design of the optimal payoff can also be specified. Denote $Y(s) = -\frac{g'(s)}{g(s)}$, so that $h^{*\prime}(s) = T_o(h^*(s))\, Y(s)$.

Differentiating twice with respect to $s$, and from the previous corollary, we deduce:

COROLLARY 7.2
Assume that $g$ is twice-differentiable. Then:

$$h^{*\prime\prime}(s) = \left[ T_o'(h^*(s)) + \frac{Y'(s)}{Y(s)^2} \right] \times \left[ T_o(h^*(s))\, Y^2(s) \right]. \qquad (7.5)$$

Therefore, usually, the higher the tolerance of risk, the higher $h^{*\prime\prime}(s)$.


Example 7.1
In what follows, the optimal portfolio profile is examined for a special case.

• The utility function is CRRA:

$$U(x) = \frac{x^{\alpha}}{\alpha}, \quad 0 < \alpha < 1, \qquad J(x) = (U')^{-1}(x) = x^{\frac{1}{\alpha - 1}}.$$

• The stock price $S$ is a geometric Brownian motion. At each time $t$, the logarithm of $S_t/S_0$ has a Gaussian probability distribution with mean $(\mu - \frac{1}{2}\sigma^2)t$ and variance $\sigma^2 t$.

The optimal payoff profile is given by:

$$h^*(s) = \frac{V_0 e^{rT}}{\int_0^{\infty} g(s)^{\frac{\alpha}{\alpha - 1}} f(s)\, ds} \; g(s)^{\frac{1}{\alpha - 1}},$$

where $s$ denotes all possible values of $S_T$, and $f$ is the pdf of $S_T$:

$$f(s) = \frac{\mathbb{1}_{\{s > 0\}}}{s\, \sigma \sqrt{2\pi T}} \exp\left[ -\frac{1}{2} \left( \frac{\ln(s/S_0) - (\mu - \frac{1}{2}\sigma^2) T}{\sigma \sqrt{T}} \right)^2 \right].$$

Within this framework, the density $g$ of the Radon-Nikodym derivative $\frac{dQ}{dP}$ is given by:

$$g(s) = \psi\, s^{-\kappa},$$

with $\theta = \frac{\mu - r}{\sigma}$, $A = -\frac{1}{2}\theta^2 T + \frac{\theta}{\sigma}\left(\mu - \frac{1}{2}\sigma^2\right) T$, $\kappa = \frac{\theta}{\sigma}$ and $\psi = e^{A} (S_0)^{\kappa}$.

Therefore $h^*(s)$ can be written as a power function of the stock $S$:

$$h^*(s) = d \cdot s^{m},$$

with

$$d = \frac{V_0 e^{rT}\, \psi^{\frac{1}{\alpha - 1}}}{\int_0^{\infty} g(s)^{\frac{\alpha}{\alpha - 1}} f(s)\, ds} > 0 \quad \text{and} \quad m = \frac{\kappa}{1 - \alpha} > 0 \ \text{(for } \mu > r\text{)}.$$

Note that, for $\mu > r$, $h^*(s)$ is an increasing function of $s$. Its profile only depends on the comparison between the relative risk aversion $1 - \alpha$ and the ratio $\kappa$, which looks like a Sharpe ratio, depending only on $\mu$, $r$, and $\sigma$:

• $h^*(s)$ is concave if $\kappa < 1 - \alpha$.
This means that, in a bearish market, the investor wants to receive a higher payoff than the stock value. However, he will get a smaller payoff if the financial market is bullish.

• $h^*(s)$ is linear if $\kappa = 1 - \alpha$.
The investor buys the stock $S$ itself.

• $h^*(s)$ is convex if $\kappa > 1 - \alpha$.
The investor has a higher risk exposure in order to benefit more from a bullish market. He is less protected against a bearish market.
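The classification above can be checked numerically. The sketch below computes the exponent $m = \kappa/(1-\alpha)$ and the scale $d$ of the power payoff, using the closed-form risk-neutral moment of a lognormal variable to enforce the budget constraint $V_0 = e^{-rT} E_Q[h^*(S_T)]$, and then verifies that constraint by a simple trapezoid rule. The function names and the parameter values in the usage lines are assumptions chosen for illustration.

```python
import math

def crra_power_payoff(v0, s0, T, r, mu, sigma, alpha):
    """Return (d, m) such that h*(s) = d * s**m for U(x) = x**alpha / alpha."""
    kappa = (mu - r) / sigma**2
    m = kappa / (1.0 - alpha)
    # Risk-neutral moment E_Q[S_T**m] of a lognormal terminal price.
    eq_sm = s0**m * math.exp(m * (r - 0.5 * sigma**2) * T + 0.5 * m**2 * sigma**2 * T)
    d = v0 * math.exp(r * T) / eq_sm
    return d, m

def profile_shape(m, tol=1e-9):
    """Shape of s -> d * s**m on s > 0, for d > 0."""
    if abs(m) < tol:
        return "constant"
    if abs(m - 1.0) < tol:
        return "linear"
    return "concave" if 0.0 < m < 1.0 else "convex"

def discounted_value(d, m, s0, T, r, sigma, n=4001, width=10.0):
    """Trapezoid approximation of exp(-rT) * E_Q[d * S_T**m]."""
    step = 2.0 * width / (n - 1)
    total = 0.0
    for i in range(n):
        z = -width + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        s = s0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        total += weight * d * s**m * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return math.exp(-r * T) * total * step

# Illustrative parameters: kappa = (0.10 - 0.03) / 0.04 = 1.75 > 1 - alpha = 0.9.
d, m = crra_power_payoff(v0=100.0, s0=100.0, T=2.0, r=0.03, mu=0.10, sigma=0.20, alpha=0.1)
shape = profile_shape(m)
```

With these illustrative inputs the ratio $\kappa$ exceeds $1 - \alpha$, so the profile falls in the convex case; lowering `mu` toward `r` moves it into the concave case.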


The following figure illustrates the concavity/convexity of the optimal portfolio profiles according to the risk-tolerance:

[Figure: payoff profile as a function of the stock value, showing a convex profile and a concave profile]

FIGURE 7.1: Optimal portfolio profiles


For a fixed relative risk aversion, the influence of the market parameters can be examined: for example, the instantaneous expected return $\mu$, which is a fundamental parameter when dealing with portfolio management. As shown in the next figure, the higher the parameter $\mu$, the more convex the optimal portfolio payoff. Note also that, for "extreme" values of $\mu$ smaller than the riskless rate $r$, the optimal profile is a decreasing function of the stock price at maturity.

[Figure: portfolio profile as a function of the risky asset value $S$, for $V_0 = S_0 = 100$, $T = 2$, $r = 3\%$, $\sigma = 20\%$, $\alpha = 0.1$]

FIGURE 7.2: Optimal portfolio profiles according to stock return


REMARK 7.2 For a HARA utility function:

$$U(x) = a \left( b + \frac{x}{c} \right)^{1 - c}, \quad 0 < c < 1,$$

$$J(x) = (U')^{-1}(x) = c \left( \left( \frac{c\,x}{a(1 - c)} \right)^{-\frac{1}{c}} - b \right).$$

Then, the optimal portfolio profile has the following form:

$$V_T^* = h^*(S_T), \quad \text{with} \quad h^*(s) = c \left( \left( \frac{s}{d} \right)^{\frac{\kappa}{c}} - b \right),$$

where $d > 0$ is a constant determined by the budget constraint.

Consequently, the optimal portfolio profile is a linear function of a power of the stock value.

• This power is equal to $\frac{\kappa}{c}$. Note that $c$ is the asymptotic relative risk aversion ($c = \lim_{x \to \infty} -\frac{x U''(x)}{U'(x)}$). Therefore, the discussion about concavity/convexity is the same as the previous one when dealing with CRRA utility functions.

• The term $-(bc)$ corresponds to a fixed guaranteed amount if it is positive, or a maximal loss if it is negative.

7.2 Application to long-term management


Consider an investor who has a specific goal such as retirement, paying for 
his children’s education, etc. He is facing a variety of decisions: 


Which amount of money should he invest initially? How should he invest 
among assets: cash, stock index, and bond funds? Should he use a market 
timing or fixed strategy? For example, Dybvig [182] has shown that the cost 
of undiversified strategies over time can be substantial. 


As mentioned in Chapter 1, a well-known property of an optimal portfolio is 
the mutual fund separation theorem: a rational investor divides his investment 
between two assets, a riskless one and a risky mutual fund, the composition 
of which is the same, whatever the investor’s risk aversion. 

As noted by Canner, Mankiw, and Weil [104], popular investment advice 
does not conform to this property. Empirical studies show that allocations 
between stocks, bonds and cash depend indeed on risk aversion. In particular, 
the ratios of bond/stock differ when considering conservative, moderate, or 
aggressive investors. For example, this ratio is 1.5 for a conservative investor, 
1.00 for a moderate one, and 0.5 for an aggressive one. 

Bajeux-Besnainou, Jordan, and Portait [38] address this inconsistency issue 
between mutual fund property and popular advice. They consider that the 
investor’s horizon exceeds the maturity of the cash asset. They introduce 
a continuous-time portfolio rebalancing. In that case, cash may be a money 
market security with a short maturity (one to six months), and may no longer 
be the common riskless asset in the standard theory. In particular, this is 
true when dealing with long-term investments. Nevertheless, the investor can 
synthesize a riskless asset (for example a zero-coupon bond maturing at the 
horizon) by using a bond fund and cash. Consequently, bonds appear in 
both synthetic riskless asset and in the risky mutual fund, which can justify 
a bond/stock ratio varying with risk aversion for any HARA investor. 


7.2.1 Assets dynamics and optimal portfolios 


We adopt the same framework. This is a generalization of the Black and Scholes model and a variant of the Merton (1971) one-state-variable model. The model assumes normality of log returns, which is not a restrictive assumption when dealing with long-term investment. In particular, this allows us to provide explicit formulas, which greatly simplifies the computation of utility and monetary losses. Obviously, other financial market models can be introduced and examined, as seen in Chapter 8.

7.2.1.1 The financial market


The market is assumed to be arbitrage-free and without friction. Financial 
transactions occur in continuous-time, along a time period [0, T]. 


Three basic assets are available at any time on the market: (1) an instantaneously riskless money market fund, the Cash, with a price denoted by $C$; (2) a Stock index fund with a price $S$; (3) a Bond fund with constant duration $D$, obtained by continuously rolling bonds throughout the investment period $[0, T]$. It is denoted by $B_D$, which is a zero-coupon bond with maturity $(t + D)$ at time $t$.

As mentioned in [38], if inflation uncertainty is ignored, the interest rate risk means real interest rate risk. The omission of inflation uncertainty may induce some problems when considering long horizons. Nevertheless, first, empirical evidence indicates, for example, that the US inflation volatility is smaller than real interest rate volatility. Second, special long-term bonds indexed on inflation are now available on some financial markets (for example, in the US or UK). Finally, it is possible to diversify by investing in the housing market.

Thus, since D > d, the riskless asset is a zero-coupon nominal bond $B_T$ that matures at the investor's horizon, and which is replicated by a dynamic combination of $C$, $B_D$ and $S$. Therefore, there exists a two-fund Cass-Stiglitz separation with a synthetic riskless fund $B_T$ replicated by $C$ and $B_D$, and a risky fund replicated by $C$, $B_D$, and $S$.


Since continuous-time rebalancing is allowed, financial markets can be as- 
sumed to be complete by introducing two sources of risk. In fact, as shown 
in Duffie and Huang [175], such assumption of dynamic market completeness 
is allowed when contingent claims can be synthesized by continuous-time re- 
balancing. 


To illustrate the results, we assume that the instantaneous riskless interest rate $r$ follows an Ornstein-Uhlenbeck process given by:

$$dr_t = a_r (b_r - r_t)\, dt - \sigma_r\, dW_{r,t}, \qquad (7.6)$$

where $a_r$, $b_r$, and $\sigma_r$ are positive constants, and $W_r$ is a standard Brownian motion. The market price of interest rate risk is assumed to be constant (see Vasicek [498]).


The asset price dynamics are given by:

(1) Cash:

$$\frac{dC_t}{C_t} = r_t\, dt. \qquad (7.7)$$

(2) Stock index:

$$\frac{dS_t}{S_t} = (r_t + \theta_S)\, dt + \sigma_1\, dW_t + \sigma_2\, dW_{r,t}. \qquad (7.8)$$

(3) Bond fund:

$$\frac{dB_{D,t}}{B_{D,t}} = (r_t + \theta_B)\, dt + \sigma_B\, dW_{r,t}, \qquad (7.9)$$

where $W$ is another standard Brownian motion, independent of $W_r$, and where $\sigma_1$, $\sigma_2$, and $\sigma_B$ are positive constants. The parameter $\theta_S$ is the constant risk premium of the stock, and $\theta_B$ is the risk premium of the bond fund.
Since $B_D$ is a zero-coupon bond indexed on the interest rate $r$, the relation between their volatilities is given by:

$$\sigma_B = \sigma_r \left( \frac{1 - e^{-a_r D}}{a_r} \right).$$

Denote also by $\sigma_{T-t}$ the volatility at time $t$ of the zero-coupon bond maturing at time $T$:

$$\sigma_{T-t} = \sigma_r \left( \frac{1 - e^{-a_r (T - t)}}{a_r} \right), \qquad (7.10)$$

which is decreasing with time (clearly $\sigma_B$ coincides with $\sigma_{T-t}$ when $T - t = D$).
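The volatility relations above are easy to evaluate. The sketch below implements Equation (7.10) as a function of the remaining maturity, together with the standard expected value of the Ornstein-Uhlenbeck short rate of Equation (7.6). The function names are introduced here for illustration, and the numerical inputs ($a_r = 12\%$, $\sigma_r = 4\%$, $b_r = 4\%$, $D = 10$ years) are the base-case values used later in the sensitivity analysis.

```python
import math

def zero_coupon_vol(tau, a_r, sigma_r):
    """Volatility of a zero-coupon bond with remaining maturity tau, as in (7.10)."""
    return sigma_r * (1.0 - math.exp(-a_r * tau)) / a_r

def expected_short_rate(t, r0, a_r, b_r):
    """E[r_t] for the Ornstein-Uhlenbeck dynamics (7.6), started at r0."""
    return b_r + (r0 - b_r) * math.exp(-a_r * t)

a_r, b_r, sigma_r, D = 0.12, 0.04, 0.04, 10.0

# Volatility of the constant-duration bond fund.
sigma_B = zero_coupon_vol(D, a_r, sigma_r)

# Volatility of the horizon bond for T = 25 as calendar time t advances.
T = 25.0
vols = [zero_coupon_vol(T - t, a_r, sigma_r) for t in (0.0, 5.0, 15.0, 25.0)]
```

The list `vols` decreases to zero as $t$ approaches $T$, and the volatility is bounded above by $\sigma_r / a_r$ for any maturity.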


Note that in this model, interest rates and stock index prices are negatively correlated¹. Furthermore, the market is complete. Therefore, there exists a unique risk-neutral probability $Q$ associated to two market risk premia, $\lambda$ and $\lambda_r$, for which the density $\eta$ with respect to the initial probability $P$ is given by:

$$\eta_t = \exp\left[ -(\lambda W_t + \lambda_r W_{r,t}) - \frac{1}{2} (\lambda^2 + \lambda_r^2)\, t \right].$$

The premia $\lambda$ and $\lambda_r$ are determined from the relations:

$$\theta_S = \sigma_1 \lambda + \sigma_2 \lambda_r,$$
$$\theta_B = \sigma_B \lambda_r.$$

In this setting, $B_T$ can be replicated using only the two assets $C$ and $B_D$. These two assets span the bond market. As noted in [38], synthesizing $B_T$ requires a positive weight on $B_D$ and a weight on $C$, which is negative if $D < T$ and positive if $D > T$. This dynamic combination of fixed-income securities of different durations is referred to as passive immunization (see Fong [239] or Fabozzi [215]).

¹ The instantaneous covariance between the interest rate and the stock index return is equal to $-\sigma_r \sigma_2$. This assumption is not necessary to solve the optimization problems.

7.2.1.2 Optimal portfolios


We recall the standard results about optimal portfolio computation. Portfolio weights are denoted by $x_C$, $x_S$, and $x_B$. The portfolio value at time $t$ is denoted by $V_t$. Therefore, the portfolio value $V$ follows the dynamics:

$$\frac{dV_t}{V_t} = \left[ r_t + x_S(t)\theta_S + x_B(t)\theta_B \right] dt + x_S(t)\sigma_1\, dW_t + \left[ x_S(t)\sigma_2 + x_B(t)\sigma_B \right] dW_{r,t}.$$

The investor's preferences are described by a utility function $U$, which embeds his risk aversion. He has an initial capital denoted by $V_0$. He is assumed to maximize expected utility over the time horizon $T$. Therefore, his optimal portfolio weights are the solutions of the following problem:

$$\max_{x_C,\, x_S,\, x_B} E[U(V_T)].$$

For different utility functions, we show below how the optimal dynamic factors depend on the investor's risk aversion. In what follows, we present the solutions for four standard utility functions:


7.2.1.2.1 Logarithmic Utility

$$U(x) = \text{Log}(x), \quad x > 0, \qquad (7.11)$$

so that the absolute risk aversion is $-U''(x)/U'(x) = 1/x$, and the relative risk aversion is constant and equal to $-xU''(x)/U'(x) = 1$. In that case, the optimal portfolio is called the numeraire portfolio (see Long [361]), or the growth-optimal portfolio (see Merton [389]). Its value at maturity $T$, denoted by $V_T^*$, is given by:

$$V_T^* = V_0 H_T,$$

where the numeraire portfolio $H_T$ is

$$H_T = \left( \frac{\eta_T}{\exp\left( \int_0^T r_s\, ds \right)} \right)^{-1}. \qquad (7.12)$$

The optimal weights at any time $t$ are displayed in the next table.

TABLE 7.1: Optimal weights for logarithmic utility function

In the growth-optimal portfolio, $h_S$ and $h_B$ represent the weights of the stock index and the constant-duration bond fund, respectively. In the case considered by [38], "this optimal portfolio is highly aggressive and thus levered (negative weight in cash). Increasing risk aversion implies decreasing weight in the growth optimal portfolio and consequently increasing weight in cash." Note that here the ratio $x_B / x_S$ is constant since, within the risky fund, the weights are the same regardless of the level of risk aversion. Therefore, there is no inconsistency with the mutual fund separation theorem. However, the ratio of all bonds $B_T$ to stock increases with risk aversion.
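To make the growth-optimal weights concrete, the following sketch derives them directly from the asset dynamics (7.7)-(7.9) and from the premia relations $\theta_S = \sigma_1 \lambda + \sigma_2 \lambda_r$ and $\theta_B = \sigma_B \lambda_r$. It relies on the standard property that the growth-optimal portfolio loads on each Brownian motion with the corresponding market price of risk, that is $x_S \sigma_1 = \lambda$ and $x_S \sigma_2 + x_B \sigma_B = \lambda_r$. This is a derivation under that assumption rather than a transcription of Table 7.1; the function names are hypothetical, and the inputs are the base-case values of Section 7.2.3.

```python
import math

def market_prices_of_risk(theta_S, theta_B, sigma_1, sigma_2, sigma_B):
    """Solve theta_S = sigma_1*lam + sigma_2*lam_r and theta_B = sigma_B*lam_r."""
    lam_r = theta_B / sigma_B
    lam = (theta_S - sigma_2 * lam_r) / sigma_1
    return lam, lam_r

def growth_optimal_weights(lam, lam_r, sigma_1, sigma_2, sigma_B):
    """Weights (h_S, h_B, h_C) whose Brownian loadings equal (lam, lam_r)."""
    h_S = lam / sigma_1
    h_B = (lam_r - h_S * sigma_2) / sigma_B
    return h_S, h_B, 1.0 - h_S - h_B

# Base-case inputs quoted in the sensitivity analysis of this chapter.
a_r, sigma_r, D = 0.12, 0.04, 10.0
theta_S, theta_B = 0.06, 0.015
sigma_1, sigma_2 = 0.19, 0.06
sigma_B = sigma_r * (1.0 - math.exp(-a_r * D)) / a_r

lam, lam_r = market_prices_of_risk(theta_S, theta_B, sigma_1, sigma_2, sigma_B)
h_S, h_B, h_C = growth_optimal_weights(lam, lam_r, sigma_1, sigma_2, sigma_B)
```

Under these inputs the cash weight `h_C` is negative, which agrees with the description of the growth-optimal portfolio as levered.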


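Some properties of the numeraire portfolio:
Recall that the numeraire portfolio $H_T$ is equal to $\exp\left( \int_0^T r_s\, ds \right) \eta_T^{-1}$.

Thus, since in particular $\int_t^T r_s\, ds$ has a Gaussian distribution, the ratio $H_T^z / H_t^z$ is lognormally distributed: it is equal to $\exp[N_z(t, T)]$, where $N_z(t, T)$ has a Gaussian distribution.

Therefore, the expectation $E_t[H_T^z / H_t^z]$ is given by:

$$E_t\left[ \frac{H_T^z}{H_t^z} \right] = \exp\left[ E(N_z)(t, T) + \frac{1}{2} Var(N_z)(t, T) \right], \qquad (7.13)$$

where

$$E(N_z)(t, T) = z\, \Phi_{(t,T)} \quad \text{and} \quad Var(N_z)(t, T) = z^2\, \Psi_{(t,T)}, \qquad (7.14)$$

with

$$\Phi_{(t,T)} = (T - t) \left[ b_r + (r_t - b_r) \frac{\sigma_{T-t}}{\sigma_r (T - t)} + \frac{\lambda^2 + \lambda_r^2}{2} \right], \qquad (7.15)$$

$$\Psi_{(t,T)} = (T - t) \times \left[ \frac{\sigma_r^2}{a_r^2} - \frac{\sigma_r\, \sigma_{T-t}}{a_r^2 (T - t)} - \frac{\sigma_{T-t}^2}{2 a_r (T - t)} + \lambda^2 + \lambda_r^2 - \frac{2 \lambda_r \sigma_r}{a_r} + \frac{2 \lambda_r \sigma_{T-t}}{a_r (T - t)} \right]. \qquad (7.16)$$

Denote respectively by $\varphi_{(t,T)}$ and $\psi_{(t,T)}$ the expectations $E_t[H_t / H_T]$ and $E_t[H_t \log(H_T / H_t) / H_T]$. We have:

$$\varphi_{(t,T)} = \exp\left[ -\Phi_{(t,T)} + \frac{1}{2} \Psi_{(t,T)} \right] \quad \text{and} \quad \psi_{(t,T)} = \varphi_{(t,T)} \times \left( \Phi_{(t,T)} - \Psi_{(t,T)} \right). \qquad (7.17)$$

To simplify the notations, we denote respectively by $E(N_z)(T)$, $Var(N_z)(T)$, $\Phi_T$, $\Psi_T$, $\varphi_T$, and $\psi_T$ the values of $E(N_z)(0, T)$, $Var(N_z)(0, T)$, $\Phi_{(0,T)}$, $\Psi_{(0,T)}$, $\varphi_{(0,T)}$, and $\psi_{(0,T)}$.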
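The two expectations in relation (7.17) are moments of a lognormal variable: if $\log(H_T/H_t)$ is Gaussian with mean $\Phi$ and variance $\Psi$, then $E[H_t/H_T] = \exp(-\Phi + \Psi/2)$ and $E[H_t \log(H_T/H_t)/H_T] = (\Phi - \Psi)\exp(-\Phi + \Psi/2)$. The sketch below checks both closed forms against a simple numerical quadrature. The values of $\Phi$ and $\Psi$ are hypothetical and the helper names are introduced only for this illustration.

```python
import math

def phi_psi(Phi, Psi):
    """Closed forms for E[exp(-L)] and E[L * exp(-L)], where L ~ N(Phi, Psi)."""
    phi = math.exp(-Phi + 0.5 * Psi)
    return phi, phi * (Phi - Psi)

def gaussian_expectation(f, mean, var, n=8001, width=12.0):
    """Riemann-sum approximation of E[f(L)] for L ~ N(mean, var)."""
    sd = math.sqrt(var)
    step = 2.0 * width / (n - 1)
    total = 0.0
    for i in range(n):
        z = -width + i * step
        total += f(mean + sd * z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)

# Hypothetical mean and variance of log(H_T / H_t).
Phi, Psi = 0.9, 0.25
phi, psi = phi_psi(Phi, Psi)
phi_num = gaussian_expectation(lambda x: math.exp(-x), Phi, Psi)
psi_num = gaussian_expectation(lambda x: x * math.exp(-x), Phi, Psi)
```

The numerical values `phi_num` and `psi_num` agree with the closed forms to high precision, which gives a quick sanity check on the signs in (7.17).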
7.2.1.2.2 CRRA utility We now consider the utility function with a constant relative risk aversion, which generalizes the previous case.

$$U(x) = \frac{x^{1 - \gamma}}{1 - \gamma}, \quad \gamma > 0, \qquad (7.18)$$

so that $-xU''(x)/U'(x) = \gamma$. The value at maturity $T$, denoted by $V_T^{CRRA}$, is

$$V_T^{CRRA} = (\lambda)^{-\frac{1}{\gamma}}\, H_T^{\frac{1}{\gamma}}, \qquad (7.19)$$

where $\lambda$ is a Lagrange multiplier determined by the initial investment $V_0$. In this case, the optimal weights at any time $t$ are deterministic functions of time (independent of the level of interest rate $r$). They are given in the next table.

TABLE 7.2: Optimal weights for CRRA utility function

Furthermore, the optimal bond over stock ratio is also time-dependent:

$$\frac{x_B(t)}{x_S(t)} = \frac{h_B}{h_S} + (\gamma - 1)\, \frac{1}{h_S}\, \frac{\sigma_{T-t}}{\sigma_B}.$$

Thus, whatever the time $t$ and for any value of the parameters, this ratio is increasing in investor risk aversion. This property is consistent with current practice. Moreover, if $\gamma$ is larger than 1, the weight in cash, $x_C$, is increasing in $t$, since it is a decreasing function of $\sigma_{T-t}$ (see Equation (7.10)).
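The dependence of the bond over stock ratio on risk aversion and on time can be tabulated with a few lines of code. The sketch below evaluates $x_B(t)/x_S(t) = h_B/h_S + (\gamma - 1)\,\sigma_{T-t}/(h_S \sigma_B)$, with $\sigma_{T-t}$ taken from Equation (7.10). The growth-optimal weights `h_S` and `h_B` used in the example are hypothetical placeholders, not values from the text, and the function name is an assumption.

```python
import math

def bond_stock_ratio(gamma, tau, h_S, h_B, a_r, sigma_r, D):
    """Bond/stock ratio for a CRRA investor with remaining horizon tau = T - t."""
    sigma_tau = sigma_r * (1.0 - math.exp(-a_r * tau)) / a_r   # Equation (7.10)
    sigma_B = sigma_r * (1.0 - math.exp(-a_r * D)) / a_r       # bond fund volatility
    return h_B / h_S + (gamma - 1.0) * sigma_tau / (h_S * sigma_B)

# Hypothetical growth-optimal weights; short-rate parameters from the base case.
h_S, h_B = 1.5, 0.5
a_r, sigma_r, D = 0.12, 0.04, 10.0

# Ratio at a fixed remaining horizon for increasing risk aversion.
by_gamma = [bond_stock_ratio(g, 20.0, h_S, h_B, a_r, sigma_r, D) for g in (1.0, 3.52, 5.28, 10.57)]

# Ratio for gamma = 5.28 as the remaining horizon shrinks.
by_tau = [bond_stock_ratio(5.28, tau, h_S, h_B, a_r, sigma_r, D) for tau in (20.0, 10.0, 5.0, 0.0)]
```

For $\gamma = 1$ the ratio reduces to $h_B/h_S$, the composition of the growth-optimal portfolio; for $\gamma > 1$ it increases with $\gamma$ and shrinks back toward $h_B/h_S$ as the horizon approaches.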


7.2.1.2.3 HARA utility The HARA utility function has a hyperbolic absolute risk aversion. It includes the CRRA utility function (described above), and the quadratic utility function as special cases. It is given by:

$$U(x) = \frac{\gamma}{1 - \gamma} \left( \frac{x - x^*}{\gamma} \right)^{1 - \gamma}, \qquad (7.20)$$

where $\gamma$ and $x^*$ are two parameters that cannot both be negative. We also require that $x^* B_T(0) - V_0 < 0$. We have $-U''(x)/U'(x) = \gamma/(x - x^*)$ and $-xU''(x)/U'(x) = \gamma x/(x - x^*)$. Two cases arise according to the sign of $\gamma$. If $\gamma > 0$, the absolute risk aversion is decreasing in $x$. In this case, the amount $x^*$ represents a required lower bound for the terminal portfolio value ($x > x^*$). The initial investment $V_0$ is at least equal to the discounted value of this lower bound ($x^* B_T(0) < V_0$). If $\gamma < 0$, the absolute risk aversion is increasing in $x$. The amount $x^*$ is a required upper bound for the terminal portfolio value ($x < x^*$). The initial investment $V_0$ is at most equal to the discounted value of this upper bound ($x^* B_T(0) > V_0$).

In both cases, the risk aversion is an increasing function of the absolute value of $\gamma$. Note that the quadratic case is obtained with $\gamma = -1$ and $x^* > 0$.

The optimal portfolio at maturity $T$, $V_T^{HARA}$, is given by:

$$V_T^{HARA} = (\mu)^{-\frac{1}{\gamma}}\, H_T^{\frac{1}{\gamma}} + x^*.$$

This expression can be interpreted as follows: the optimal portfolio is a combination of a CRRA fund with parameter $\gamma$ and a zero-coupon bond yielding $x^*$ at time $T$. The optimal portfolio weights for the HARA utility function are functions of $\omega_t = 1 - B_{T-t}(t)\, x^* / V_t^{HARA}$.

TABLE 7.3: Optimal weights for HARA utility function

In these expressions, $B_{T-t}(t)$ represents the value at time $t$ of a zero-coupon bond maturing at time $T$. The factor $\omega_t$ represents the proportion at time $t$ of the risky fund in the total portfolio value. Therefore, the ratio Bond over Stock is no longer deterministic. In this case, the optimal weights depend on market conditions (median, bullish, or bearish markets).


7.2.2 Exponential utility
Finally, we consider the exponential utility function:

$$U(x) = -\frac{e^{-ax}}{a}, \quad a > 0,$$

so that the absolute risk aversion is constant, $-U''(x)/U'(x) = a$. The optimal portfolio value for the exponential utility function, $V_T^{exp}$, is a function of the numeraire portfolio (see Equation (7.12)):

$$V_T^{exp} = A(V_0) + \frac{1}{a} \log(H_T),$$

where

$$A(V_0) = \frac{V_0 - \frac{1}{a} E\left[ \log(H_T) / H_T \right]}{E\left[ 1 / H_T \right]}.$$

The computations of the expressions for the optimal weights are more involved:

• To compute the replicating strategies for a given optimal portfolio, first note that $V_t / H_t$ is a martingale. Thus, we obtain the martingale relation

$$V_t / H_t = E_{P,t}\left[ V_T / H_T \right]. \qquad (7.21)$$

• Therefore, it is necessary to compute conditional expectations of quantities which are functions of $V_T / H_T$. For this purpose, the conditional expectations of the numeraire portfolio have to be used.

Using the martingale relation, the value $V_t^{exp}$ is given by:

$$V_t^{exp} = H_t\, E_t\left[ \frac{A(V_0) + \frac{1}{a} \log(H_T)}{H_T} \right].$$

Then, we obtain the expression of the optimal portfolio value at any time $t$ of the investment period:

$$V_t^{exp} = \left[ A(V_0)\, \varphi(t, r_t) + \frac{1}{a} \psi(t, r_t) \right] + \frac{1}{a} \log(H_t)\, \varphi(t, r_t). \qquad (7.22)$$


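• To compute the optimal weights, we have to determine $dV_t^{exp} / V_t^{exp}$ and to search for $x_S$, $x_B$, and $x_C$ such that

$$\frac{dV_t^{exp}}{V_t^{exp}} = x_S \frac{dS_t}{S_t} + x_B \frac{dB_{D,t}}{B_{D,t}} + x_C \frac{dC_t}{C_t}.$$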




e For this purpose, applying Ito’s formula to Equation (7.22), we obtain: 





ey o 10 1 dy 
dV, P = jawo) Sle rt) + ain) t= += In (H) = At P t, ro)| dt 
A Op 1 Ay Op 
+I |A) ga tre) + arz +- In( H+) zld 
ð 10 
a javo Er) i a re) + E mar) Etro] de 
1 dH, 1 | 
+-y(t,r. d< H, H >|, 
oltre) | H, 2H? aia 
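where $\langle H, H \rangle$ is the "instantaneous variance" of $H_t$. Define $\chi(t, r_t)$ by:

$$\chi(t, r_t) = A(V_0) \frac{\partial \varphi}{\partial r}(t, r_t) + \frac{1}{a} \frac{\partial \psi}{\partial r}(t, r_t) + \frac{1}{a} \ln(H_t) \frac{\partial \varphi}{\partial r}(t, r_t).$$

Then, $dV_t^{exp}$ has the form:

$$dV_t^{exp} = V_t^{exp} \left[ \frac{1}{a} \frac{\varphi(t, r_t)}{V_t^{exp}} \left( h_S \frac{dS_t}{S_t} + h_B \frac{dB_{D,t}}{B_{D,t}} \right) + \frac{\chi(t, r_t)}{V_t^{exp}}\, dr_t + (\cdot)\, dt \right].$$

Using this property, it is possible to determine the optimal weights by identifying the martingale components. This yields the following relations:

$$x_S \sigma_1 = \frac{1}{a} \left( \frac{\varphi(t, r_t)}{V_t^{exp}} \right) h_S \sigma_1,$$

$$x_S \sigma_2 + x_B \sigma_B = \frac{1}{a} \left( \frac{\varphi(t, r_t)}{V_t^{exp}} \right) h_S \sigma_2 + \frac{1}{a} \left( \frac{\varphi(t, r_t)}{V_t^{exp}} \right) h_B \sigma_B + \frac{\chi(t, r_t)}{V_t^{exp}} (-\sigma_r),$$

from which we can compute the optimal weights for the exponential utility function.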














e Finally, we deduce the optimal portfolio weights for the CARA utility 
function, 


TABLE 7.4: Optimal weights for CARA utility 


function 











xs = Ł (<u) hs 
on = ($) (hs + he) — (3e) 
where 
1 OT aVo — Y% 
XT) = (=) leu) ( - ‘)) (1 + Wer) — Par a = in(H)) . 


Note that these optimal weights are deterministic functions of the instantaneous interest rate $r_t$ and the portfolio value $V_t^{exp}$. These functions depend only on market parameters. The ratio Bond over Stock is stochastic.

7.2.3 Sensitivity analysis


We consider a numerical base case. For simplicity, from now on we focus our attention on the CRRA utility function (the other cases can be treated in a similar way). For the base case, for the values of the short rate parameters, we use the estimates of Chan [114]: speed of convergence, $a_r = 12\%$, asymptotic short rate value, $b_r = 4\%$, and volatility, $\sigma_r = 4\%$ (see Equation (7.6)). The current instantaneous interest rate is $r_0 = 4\%$. The risk premia are $\theta_B = 1.5\%$ and $\theta_S = 6\%$; the index stock volatilities are $\sigma_1 = 19\%$ and $\sigma_2 = 6\%$ (see Equations (7.8) and (7.9)). Finally, the Bond fund constant duration is $D = 10$ years.


TABLE 7.5: Asset allocation sensitivities for CRRA utility

Market parameters and            CASH %   BONDS %   STOCKS %   Bonds/Stocks
investor's type γ

Decrease of the volatility of interest rate from 4% to 2%
  γ = 3.52                        -13       70        43         1.63
  γ = 5.28                         -8       80        28         2.86
  γ = 10.57                        -2       88        14         6.28

Decrease of the speed of convergence a_r from 12% to 10%
  γ = 3.52                        -10       65        45         1.44
  γ = 5.28                         -5       75        30         2.50
  γ = 10.57                         0       85        15         5.66

Change in the value of b_r from 4% to 2%
  γ = 3.52                        -10.5     65.5      45         1.45
  γ = 5.28                         -6       76        30         2.53
  γ = 10.57                        -2       87        15         5.80


A decrease in the volatility of the interest rate shifts money from stock to cash and bonds (the instantaneous bond return is less risky). Decreasing the speed of convergence $a_r$ is similar to increasing interest rate volatility, since interest rates converge slowly to their long-run value $b_r$. Decreasing the value of $b_r$ has almost no impact for the CRRA case, since portfolio weights are independent of the interest rate level.

7.2.3.1 Utility of the optimum portfolio


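Consider an investor with a HARA utility function, with relative risk aversion $\gamma$. His time horizon is $T$. In this case, the discounted expected utility (computed at time $t = 0$) is given by:

$$E\left[ U(V_T^{HARA}) \right] = \left( \frac{\gamma}{1 - \gamma} \right) \left( \frac{V_0 - x^* B_T(0)}{\gamma} \right)^{1 - \gamma} \qquad (7.23)$$

$$\times \exp\left[ \gamma \left( E(N_z)(T) + \frac{1}{2} Var(N_z)(T) \right) \right], \qquad (7.24)$$

and, for the special case CRRA, we obtain:

$$E\left[ U(V_T^{CRRA}) \right] = \left( \frac{V_0^{1 - \gamma}}{1 - \gamma} \right) \exp\left[ \gamma \left( E(N_z)(T) + \frac{1}{2} Var(N_z)(T) \right) \right], \qquad (7.25)$$

with $z = (1 - \gamma)/\gamma$, where $V_0$ is the initial investment, and where $E(N_z)(T)$ and $Var(N_z)(T)$ are defined by relation (7.14).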
Consider now an investor with a CARA utility function, with absolute risk aversion $a$. We have:

$$E\left[ U(V_T^{exp}) \right] = -\frac{\varphi_T}{a} \exp\left[ \frac{\psi_T - a V_0}{\varphi_T} \right]. \qquad (7.26)$$


Under the same numerical assumptions as in the previous base case, consider a financial institution that offers its clients three standardized portfolios: the first one is aggressive (45% Stock), the second one is a moderate portfolio (30% Stock), and the third is a conservative portfolio (15% Stock). Assuming CRRA utility functions, it is possible to recover two unknowns: the values of the risk aversion and of the investment period corresponding to these portfolios. For this example, we find $T = 25$ years, and we provide below the values of the risk aversion which best fit these three portfolios.

TABLE 7.6: Asset allocations for aggressive, moderate and conservative investors (CRRA utility function)

CASH %   BONDS %   STOCKS %   Ratio of Bonds to Stocks   Investor type   γ
 -10       65        45         1.44                      Aggressive      3.52
  -6       76        30         2.53                      Moderate        5.28
  -2       87        15         5.8                       Conservative    10.57

7.2.4 Distribution of the optimal portfolio return


The performance of an optimal strategy can be illustrated by the distribution of return at maturity. Below, this distribution is computed for the CRRA utility function (the other cases can be derived in a similar way).

From Equation (7.19), we deduce the optimal portfolio return $V_T^{CRRA} / V_0$:

$$R(\gamma) = \frac{H_T^{1/\gamma}}{E\left[ H_T^z \right]}.$$

Since $H_T^{1/\gamma}$ has a lognormal distribution, the cumulative distribution function $F_{R(\gamma)}(x)$ of the return $R(\gamma)$ can be computed explicitly. Let $Y_T$ denote the random variable such that $H_T = \exp[Y_T]$. The distribution of $Y_T$ is Gaussian. Denote respectively by $m_T$ and $s_T$ its expectation and standard deviation.

The expectation $E[H_T^z]$ is given by:

$$E\left[ H_T^z \right] = \exp\left[ E(N_z) + \frac{1}{2} Var(N_z) \right],$$

with

$$E(N_z) = z\, \Phi_T \quad \text{and} \quad Var(N_z) = z^2\, \Psi_T. \qquad (7.27)$$

Then, we have:

$$F_{R(\gamma)}(x) = P\left[ \frac{H_T^{1/\gamma}}{E\left[ H_T^z \right]} \leq x \right] = P\left[ X \leq \frac{\gamma\, \text{Log}\left( x\, E\left[ H_T^z \right] \right) - m_T}{s_T} \right],$$

where $X$ is a random variable with a standard Gaussian distribution. Note that $m_T$ and $s_T$ are constants, which only depend on market parameters (but not on risk aversion $\gamma$).
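The distribution function above and its inverse can be evaluated with the standard library alone. The sketch below uses the fact that $E[H_T^z] = \exp(z\, m_T + \frac{1}{2} z^2 s_T^2)$ when $\log H_T$ is Gaussian with mean $m_T$ and standard deviation $s_T$. The numerical values of `m_T` and `s_T` are hypothetical, chosen only to illustrate the shape of the curves, and the function names are assumptions introduced here.

```python
import math
from statistics import NormalDist

def expected_power(z, m_T, s_T):
    """E[H_T**z] when log(H_T) ~ N(m_T, s_T**2)."""
    return math.exp(z * m_T + 0.5 * z * z * s_T**2)

def return_cdf(x, gamma, m_T, s_T):
    """P[R(gamma) <= x] for the CRRA optimal return R = H_T**(1/gamma) / E[H_T**z]."""
    z = (1.0 - gamma) / gamma
    threshold = (gamma * math.log(x * expected_power(z, m_T, s_T)) - m_T) / s_T
    return NormalDist().cdf(threshold)

def return_quantile(p, gamma, m_T, s_T):
    """Inverse of return_cdf: the return level reached with probability p."""
    z = (1.0 - gamma) / gamma
    y = m_T + s_T * NormalDist().inv_cdf(p)
    return math.exp(y / gamma) / expected_power(z, m_T, s_T)

# Hypothetical moments of log(H_T); gamma values taken from Table 7.6.
m_T, s_T = 1.7, 1.4
low_tail = {g: return_quantile(0.02, g, m_T, s_T) for g in (3.52, 5.28, 10.57)}
high_tail = {g: return_quantile(0.98, g, m_T, s_T) for g in (3.52, 5.28, 10.57)}
```

With these inputs the low quantile rises with $\gamma$ while the high quantile falls, which is the trade-off between downside protection and upside potential discussed for the three investor types.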

The inverse cumulative distribution functions are displayed in the next figure in the base case for three values of $\gamma$ ($\gamma = 3.52$, $\gamma = 5.28$, and $\gamma = 10.57$) and for a time horizon $T = 20$ years.

At a given probability level of 0.98, an investor with an aggressive portfolio has a guarantee to recover only 70% of his initial investment. However, this investor has a 20% chance of multiplying his initial investment by a factor of 5. If this investor selects a conservative portfolio, he has a guarantee (up to 98%) to increase the value of his initial portfolio by half. However, in this case, the probability of multiplying his initial investment by 5 is almost zero (the probability of multiplying the initial investment by 3 is only about 10%). Finally, the investor selecting the moderate portfolio (intermediary curve) has a 98% probability of recovering his initial investment after 20 years.

FIGURE 7.3: Inverse cumulative distribution of the return at maturity


7.3 Further reading 


Long-term management depends on a large variety of factors. These in- 
clude: income, saving capacity, age, gender, know-how, experience, liquidity, 
real estate, investment horizon, attitudinal and personality factors, investment 
objectives (e.g. planned projects or retirements), initial investable assets and 
additional funds. Campbell and Viceira [103] discuss to what extent life-cycle 
portfolio choice and saving are affected by the variables mentioned above. 


The influence of risk aversion has been examined by Kallberg and Ziemba 
[315]; the optimal portfolio is more sensitive to the value of risk aversion than 
to the functional form of utility function. 


Similarly, Brennan and Xia [89] highlight the importance of considering in- 
vestors’ time horizons in the analysis of optimal portfolio policies. 


Long-term portfolio optimization with fixed incomes is studied in Brennan 
and Xia [89], who examine the Bond-Stock Mix when stochastic interest rates 
are involved. Fabozzi [215] provides an overview on bond markets analysis. 
Sorensen [473] and Lioui and Poncet [357] analyze optimal portfolio choice 
and fixed income management under stochastic interest rates. 


Battocchio et al. [50] examine in particular the role of the decumulation phase in the determination of the optimality of asset allocation under mortality risk.

Note also that financial institutions typically offer a limited number of stan- 
dardized portfolios which imperfectly match investor preferences. Jensen and 
Sorensen [302] and De Palma and Prigent [161] quantify the efficiency losses of 
an investor acquiring a standardized (e.g., conservative, balanced, and aggres- 
sive) versus a customized portfolio. A standardized portfolio may differ from 
the optimal one by its risk exposure or its time horizon. Numerical results 
show that the monetary losses from not having access to a customized port- 
folio can be substantial, in particular when the time horizon of the investor 
differs from the one selected by the financial institution. 


Chapter 8 


Optimization within specific markets 


As seen in Chapter 6, the investor’s utility maximization of his terminal wealth 
can be solved in the continuous-time setting. Using methods of stochastic 
optimal control, Merton ([385], [386]) proves that the value function is the 
solution of a non-linear partial differential equation: the Bellman equation. 
Then, closed-form solutions are available for the HARA utility. However, the 
dynamic programming method is based on Markovian assumptions. To avoid 
this hypothesis, another approach has been introduced: the duality portfolio 
characterization by using the martingale measures, the so-called risk-neutral 
measures. For complete markets, this set is reduced to one point and the 
optimal solution is determined from the fundamental result: the terminal 
wealth of the optimal portfolio is equal (up to a multiplicative constant) to 
the marginal utility inverse of the density of the martingale measure. This 
method, illustrated in Chapter 6, has been introduced by Pliska [410], Cox
and Huang ([132], [133]), and Karatzas et al. [319]. This is in line with the
optimal investment problem for a one-period model with a finite set of random 
events solved by introducing the Arrow-Debreu state prices. 


However:

• The assumption of financial market completeness is very strong. Roughly speaking, it is supposed that there are as many assets as random sources.

• In addition, several market "frictions" must also be taken into account:
- Constraints, such as no-short selling;
- Transaction costs;
- Labor income stream;
- Partial information about security prices.

Thus, specific methods must be introduced to examine such optimization problems.

8.1 Optimization in incomplete markets


He and Pearson ({288], [289]) have studied this problem both in discrete- 
time and continuous-time frameworks. Karatzas et al. [319] have shown 
how expected utility maximization can be solved by martingale methods and 
convex duality. In what follows, first a general result, due to Kramkov and 
Schachermayer [335], is presented. Then, some standard financial models are 
considered and explicit optimal solutions are detailed. 


8.1.1 General result based on martingale method 


- Price process: Consider a financial market with (d+ 1) securities, one 
riskless bond with constant rate r, and d stocks with price process (Si,t)i,t, 
t € [0,7], and 1 <i < d. Note that the time horizon can be also infinite. The 
process S is assumed to be a semimartingale on a filtered probability space 
(Q,F, (Files P). 


- Portfolio strategy and value: Recall that a self-financing portfolio strategy 
(9:,4)i2, where 0; denotes the amount invested on asset i at time t, is a pre- 
dictable process, which is integrable w.r.t. the price process $. The portfolio 
value process (V;):, associated to strategy (9;2)i,2, is given by: for t € [0, T], 


d t 
WaVed > bi sdis. (8.1) 
i=1 79 


Denote by V(Vo) the family of wealth processes (V;); such that for t € [0, T], 
V; > 0 and with initial value Vo. 


- Equivalent local martingale measure: A probability measure $Q$, equivalent to $P$, is called an equivalent local martingale measure if any process $V$ in $\mathcal{V}(1)$ is a local martingale w.r.t. $Q$. Note that when the process $S$ is bounded (resp. locally bounded), then under an equivalent local martingale measure, the process $S$ is a martingale (resp. local martingale) (see Delbaen and Schachermayer [155] for more details about this property). Denote by $\mathcal{M} = \mathcal{M}^e(S)$ the set of equivalent local martingale measures, which is assumed to be non-empty due to the absence of arbitrage opportunities, as shown by Harrison et al. ([284], [285]). 








- Expected utility maximization: The investor has a utility function on wealth $U : (0,+\infty) \to \mathbb{R}$ and seeks the value function associated to the primal problem: 

$$\mathcal{J}(V_0) = \sup_{V \in \mathcal{V}(V_0)} E\left[U(V_T)\right]. \tag{8.2}$$
Assumptions (A) on the utility function: The function $U$ is defined on $\mathbb{R}^+$, strictly increasing, strictly concave, continuously differentiable, and such that: 

$$U'(0) = \lim_{x \to 0} U'(x) = +\infty, \qquad U'(\infty) = \lim_{x \to \infty} U'(x) = 0.$$


The value function $\mathcal{J}$ is supposed to satisfy $\mathcal{J}(x) < \infty$ for some $x > 0$. 
In order to solve the optimization problem in this general framework, Kramkov and Schachermayer [335] introduce the following key assumption: 

- Asymptotic elasticity of the utility function: The utility function $U$ has asymptotic elasticity $AE(U)$ strictly smaller than 1 if: 

$$AE(U) = \limsup_{x \to \infty} \frac{x U'(x)}{U(x)} < 1. \tag{8.3}$$

Examples of such utility functions are the logarithmic and power utilities: for $U(x) = \ln x$, $AE(U) = 0$, and for $U(x) = x^{\alpha}/\alpha$ with $0 < \alpha < 1$, $AE(U) = \alpha$. However, for instance, the utility function $U(x) = x/\ln x$ (for $x$ sufficiently large) has $AE(U)$ equal to 1. 
Kramkov and Schachermayer [335] show that the condition $AE(U) < 1$ is necessary and sufficient to get the following properties under the previous assumption on the price process $S$: 





• 1) The value function $\mathcal{J}$ is a utility function which is increasing, strictly concave, continuously differentiable, and such that: 

$$\lim_{x \to 0} \mathcal{J}'(x) = +\infty \quad \text{and} \quad \lim_{x \to \infty} \mathcal{J}'(x) = 0.$$

• 2) The optimization problem has a solution. 


Note also that either the utility function $U$ satisfies $AE(U) < 1$, which implies that $AE(\mathcal{J}) < 1$, or $AE(U) = 1$, in which case there exists an $\mathbb{R}$-valued price process $S$ which is continuous, induces a complete market, and is such that $\mathcal{J}$ is not strictly concave (more precisely, there exists $x_0$ such that $\mathcal{J}(x)$ is a straight line with slope one for $x > x_0$). 
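
The asymptotic elasticity condition can be explored numerically. The following is a minimal sketch, not part of the cited results: it evaluates the ratio $xU'(x)/U(x)$ on an increasing grid for the logarithmic utility, a power utility with a hypothetical exponent `alpha = 0.5`, and a utility behaving like $x/\ln x$ for large $x$ (an illustrative choice assumed here for the borderline case $AE(U) = 1$). The function names are illustrative only.

```python
import math

def elasticity(U, dU, x):
    """Return x * U'(x) / U(x), the ratio whose limsup defines AE(U)."""
    return x * dU(x) / U(x)

alpha = 0.5  # hypothetical power-utility exponent, 0 < alpha < 1

# Each entry maps a name to the pair (U, U'); all three are assumptions
# made for illustration.
utilities = {
    "log": (math.log, lambda x: 1.0 / x),
    "power": (lambda x: x**alpha / alpha, lambda x: x**(alpha - 1.0)),
    "x/ln(x)": (lambda x: x / math.log(x),
                lambda x: (math.log(x) - 1.0) / math.log(x)**2),
}

def elasticity_path(name, grid=(1e2, 1e4, 1e8, 1e16)):
    """Evaluate the elasticity ratio along an increasing grid of wealth levels."""
    U, dU = utilities[name]
    return [elasticity(U, dU, x) for x in grid]

for name in utilities:
    print(name, [round(v, 4) for v in elasticity_path(name)])
```

Along the grid, the ratio decreases towards 0 for the logarithm, stays at `alpha` for the power utility, and increases towards 1 for the third function, which is the case excluded by condition (8.3).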

- Legendre-transform and conjugate utility function: As seen in Chapter 6, it is useful to introduce the conjugate function of the utility function $U$: 

$$\tilde{U}(y) = \max_{x > 0} \left[U(x) - xy\right], \quad y > 0. \tag{8.4}$$

The function $\tilde{U}$ is the Legendre-transform of the function $-U(-x)$ (see Rockafellar [426]). If the utility function $U$ satisfies assumption (A), the function $\tilde{U}$ is continuously differentiable, decreasing, strictly convex and such that: 

$$\lim_{y \to 0} \tilde{U}(y) = \lim_{x \to \infty} U(x) \quad \text{and} \quad \lim_{y \to \infty} \tilde{U}(y) = \lim_{x \to 0} U(x), \tag{8.5}$$
$$\lim_{y \to 0} \tilde{U}'(y) = -\infty \quad \text{and} \quad \lim_{y \to \infty} \tilde{U}'(y) = 0. \tag{8.6}$$
Furthermore, we have the following bidual property: 

$$U(x) = \min_{y > 0} \left[\tilde{U}(y) + xy\right], \quad x > 0. \tag{8.7}$$

The negative of the derivative of $\tilde{U}$ is the inverse function of the derivative of $U$. Denote it by $J$. We then have: $J = -\tilde{U}' = (U')^{-1}$. 


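Example 8.1 
Consider the power utility $U(x) = x^{\alpha}/\alpha$ with $0 < \alpha < 1$. Then, 

$$\tilde{U}(y) = \frac{1-\alpha}{\alpha}\, y^{\frac{\alpha}{\alpha - 1}}.$$

Consider the exponential utility $U(x) = -\dfrac{e^{-ax}}{a}$ with $0 < a$. Then, 

$$\tilde{U}(y) = \frac{y}{a}\left(\ln y - 1\right).$$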
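
The closed form of the conjugate of the power utility can be checked against a direct numerical maximization of $x \mapsto U(x) - xy$. The sketch below is only an illustration under stated assumptions: the exponent and the point $y$ are hypothetical, and the maximization uses a plain ternary search, which is valid here because the objective is concave in $x$.

```python
def power_utility(x, alpha):
    """U(x) = x**alpha / alpha with 0 < alpha < 1."""
    return x**alpha / alpha

def conjugate_closed_form(y, alpha):
    """(1 - alpha)/alpha * y**(alpha/(alpha - 1)), the formula of Example 8.1."""
    return (1.0 - alpha) / alpha * y ** (alpha / (alpha - 1.0))

def conjugate_numeric(y, alpha, lo=1e-9, hi=1e6, iters=200):
    """Maximize the concave map x -> U(x) - x*y by ternary search on [lo, hi]."""
    def f(x):
        return power_utility(x, alpha) - x * y
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

# Hypothetical values: alpha = 0.5 and y = 2 give a conjugate value of 0.5.
print(conjugate_closed_form(2.0, 0.5), conjugate_numeric(2.0, 0.5))
```

For `alpha = 0.5` and `y = 2`, the maximizer is $x = 1/4$ and both evaluations agree on the value $1/2$.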


- Optimal solutions: We refer to [335] for the three following theorems and 
their detailed proofs, corresponding first to the complete case, and second to 
the incomplete case with AE(U) < 1 or AE(U) =1. 


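THEOREM 8.1 Complete case 

Suppose that the previous assumptions are satisfied, and also that $\mathcal{M} = \{Q\}$. Denote: 

$$\tilde{\mathcal{J}}(y) = E\left[\tilde{U}\left(y \frac{dQ}{dP}\right)\right]. \tag{8.8}$$

We have: 

(i) The functions $\mathcal{J}$ and $\tilde{\mathcal{J}}$ are finite: 

$$\mathcal{J}(x) < \infty,\ \forall x > 0, \quad \text{and} \quad \tilde{\mathcal{J}}(y) < \infty,\ \forall y > 0 \text{ sufficiently large}.$$

Denote $y_0 = \inf\{y : \tilde{\mathcal{J}}(y) < \infty\}$. The function $\tilde{\mathcal{J}}$ is continuously differentiable and strictly convex on $]y_0, \infty[$. Denote $x_0 = \lim_{y \to y_0} \left(-\tilde{\mathcal{J}}'(y)\right)$. The function $\mathcal{J}$ is continuously differentiable on $]0, \infty[$ and strictly concave on $]0, x_0[$. The value functions $\mathcal{J}$ and $\tilde{\mathcal{J}}$ are conjugate: 

$$\tilde{\mathcal{J}}(y) = \max_{x > 0}\left[\mathcal{J}(x) - xy\right], \quad y > 0, \tag{8.9}$$
$$\mathcal{J}(x) = \min_{y > 0}\left[\tilde{\mathcal{J}}(y) + xy\right], \quad x > 0, \tag{8.10}$$

and their derivatives satisfy: 

$$\lim_{x \to 0^+} \mathcal{J}'(x) = \infty \quad \text{and} \quad \lim_{y \to \infty} \tilde{\mathcal{J}}'(y) = 0.$$

(ii) If $x < x_0$, the optimal solution $V_T^*(x)$ is given by: 

$$V_T^*(x) = J\left(y \frac{dQ}{dP}\right), \quad \text{for } y > y_0,$$

where $x$ and $y$ satisfy $y = \mathcal{J}'(x)$, or equivalently $x = -\tilde{\mathcal{J}}'(y)$. Note that, in that case, the optimal solution process $V^*(x)$ is a uniformly integrable martingale under the risk-neutral probability $Q$. 

(iii) For $0 < x < x_0$ and for $y > y_0$, we have: 

$$\mathcal{J}'(x) = E\left[\frac{V_T^*(x)\, U'(V_T^*(x))}{x}\right], \qquad \tilde{\mathcal{J}}'(y) = E\left[\frac{dQ}{dP}\, \tilde{U}'\left(y \frac{dQ}{dP}\right)\right].$$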
In order to study the incomplete case, we introduce: 

• The family $\mathcal{Y}(y)$ of non-negative semimartingales $Y$ with $Y_0 = y$ and such that, for any $V \in \mathcal{V}(1)$, the product $VY$ is a supermartingale. Note that this set contains the density processes of all equivalent local martingale measures $Q$. 

• The value function of the dual optimization problem is denoted and defined by: 

$$\tilde{\mathcal{J}}(y) = \inf_{Y \in \mathcal{Y}(y)} E\left[\tilde{U}(Y_T)\right]. \tag{8.11}$$

For the complete case, we have also $\tilde{\mathcal{J}}(y) = E\left[\tilde{U}\left(y \frac{dQ}{dP}\right)\right]$. For the incomplete case, they may differ. 


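Since the functions $\mathcal{J}$ and $-\tilde{\mathcal{J}}$ are concave, the right-continuous versions of their derivatives $\mathcal{J}'$ and $-\tilde{\mathcal{J}}'$ exist. Recall that the asymptotic elasticity $AE(\mathcal{J})$ is defined by: 

$$AE(\mathcal{J}) = \limsup_{x \to \infty} \frac{x \mathcal{J}'(x)}{\mathcal{J}(x)}.$$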
The following two theorems (see [335]) provide general results for the incomplete case. 

THEOREM 8.2 Incomplete case with general utility function U 

Under previous assumptions, we have: 
(i) The function $\mathcal{J}$ has finite values, $\mathcal{J}(x) < \infty$, $\forall x > 0$, and the function $\tilde{\mathcal{J}}(y)$ is finite for all sufficiently large $y$. Denote $y_0 = \inf\{y : \tilde{\mathcal{J}}(y) < \infty\}$. The function $\mathcal{J}$ is continuously differentiable on $]0, \infty[$, and the function $\tilde{\mathcal{J}}$ is strictly convex on $]y_0, \infty[$. The value functions $\mathcal{J}$ and $\tilde{\mathcal{J}}$ are conjugate: 

$$\tilde{\mathcal{J}}(y) = \max_{x > 0}\left[\mathcal{J}(x) - xy\right], \quad y > 0, \tag{8.12}$$
$$\mathcal{J}(x) = \min_{y > 0}\left[\tilde{\mathcal{J}}(y) + xy\right], \quad x > 0, \tag{8.13}$$

and their derivatives satisfy: 

$$\lim_{x \to 0^+} \mathcal{J}'(x) = \infty \quad \text{and} \quad \lim_{y \to \infty} \tilde{\mathcal{J}}'(y) = 0. \tag{8.14}$$

(ii) If $\tilde{\mathcal{J}}(y) < \infty$, then the optimal solution $Y^*(y) \in \mathcal{Y}(y)$ of the dual problem (8.11) exists and is unique. 


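THEOREM 8.3 Incomplete case with AE(U) < 1 

Under previous assumptions and the additional hypothesis $AE(U) < 1$, we have: 

(i) The function $\tilde{\mathcal{J}}$ is finite: $\tilde{\mathcal{J}}(y) < \infty$, $\forall y > 0$. The functions $\mathcal{J}$ and $\tilde{\mathcal{J}}$ are continuously differentiable on $]0, \infty[$. The functions $\mathcal{J}'$ and $-\tilde{\mathcal{J}}'$ are strictly decreasing and satisfy 

$$\lim_{x \to \infty} \mathcal{J}'(x) = 0 \quad \text{and} \quad \lim_{y \to 0} -\tilde{\mathcal{J}}'(y) = +\infty. \tag{8.15}$$

The asymptotic elasticity $AE(\mathcal{J})$ is also smaller than 1 and (with the notation $x^+ = \max(x, 0)$): 

$$AE(\mathcal{J})^+ \le AE(U)^+ < 1. \tag{8.16}$$

(ii) The optimal solution $V_T^*$ exists and is unique. If $Y^*(y) \in \mathcal{Y}(y)$ is the optimal solution of problem (8.11), where $y = \mathcal{J}'(x)$, we have the dual relation: 

$$V_T^*(x) = J\left(Y_T^*(y)\right). \tag{8.17}$$

The process $V^*(x) Y^*(y)$ is a uniformly integrable martingale on $[0, T]$. 

(iii) The relations between $\mathcal{J}$, $\tilde{\mathcal{J}}$ and $V_T^*$, $Y_T^*$ are: 

$$\mathcal{J}'(x) = E\left[\frac{V_T^*(x)\, U'(V_T^*(x))}{x}\right], \qquad \tilde{\mathcal{J}}'(y) = E\left[\frac{Y_T^*(y)\, \tilde{U}'(Y_T^*(y))}{y}\right]. \tag{8.18}$$

(iv) The value function $\tilde{\mathcal{J}}$ is given by: 

$$\tilde{\mathcal{J}}(y) = \inf_{Q \in \mathcal{M}} E\left[\tilde{U}\left(y \frac{dQ}{dP}\right)\right]. \tag{8.19}$$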














Example 8.2 One-period model 
Consider a one-period model where the investor has a logarithmic utility function $U(x) = \ln x$. Thus, we have: 

$$AE(U) = \limsup_{x \to \infty} \frac{1}{\ln x} = 0 < 1 \quad \text{and} \quad J(y) = \frac{1}{y}, \quad \tilde{U}(y) = -\ln y - 1.$$

Suppose that the financial market contains a riskless asset, taken as numeraire, and a stock $S$ with only three return values. Set $S_0 = 1$. The stock value at maturity $T$ is given by: 

$$S_T = \begin{cases} u \ (\text{"up"}) & \text{with probability } p, \\ m \ (\text{"mean"}) & \text{with probability } 1 - p - q, \\ d \ (\text{"down"}) & \text{with probability } q, \end{cases}$$

with $d < m < u$. Assume that $m = 1$. 
Then, a supermartingale $Y$ in $\mathcal{Y}(y)$ has the following form: $Y_0 = y$ and 

$$Y_T = \begin{cases} y_u \ (\text{"up"}) & \text{with probability } p, \\ y_m \ (\text{"mean"}) & \text{with probability } 1 - p - q, \\ y_d \ (\text{"down"}) & \text{with probability } q, \end{cases}$$

where $y_d$, $y_m$, and $y_u$ are non-negative scalars. 
The primal optimization problem is associated to the value function $\mathcal{J}$: 

$$\mathcal{J}(x) = \sup_{V \in \mathcal{V}(x)} E\left[\ln V_T\right].$$

The dual problem is associated to the value function $\tilde{\mathcal{J}}$: 

$$\tilde{\mathcal{J}}(y) = \inf_{Y \in \mathcal{Y}(y)} E\left[-\ln Y_T - 1\right].$$

Since both $Y$ and $YS$ must be supermartingales, $\tilde{\mathcal{J}}(y)$ can be written as: 

$$\tilde{\mathcal{J}}(y) = -1 - \sup_{y_d, y_m, y_u} E\left[\ln Y_T\right] = -1 - \sup_{y_d, y_m, y_u} \left[q \ln y_d + (1 - q - p) \ln y_m + p \ln y_u\right],$$

with 

$$E[Y_T] = q y_d + (1 - q - p) y_m + p y_u \le y,$$
$$E[Y_T S_T] = q d\, y_d + (1 - q - p) y_m + p u\, y_u \le y,$$
$$y_d, y_m, y_u \ge 0.$$

Note that the optimum $(y_d^*, y_m^*, y_u^*)$ is necessarily such that the previous constraints hold with equality. Then, we deduce: 

$$y_u^* = y\, \frac{(p+q)(1-d)}{p(u-d)}, \qquad y_m^* = y, \qquad y_d^* = y\, \frac{(p+q)(u-1)}{q(u-d)}.$$

Finally, since $V_T^*(x) = J(Y_T^*(y))$ where $y = \mathcal{J}'(x) = 1/x$, we deduce: 

$$V_T^*(x) = \begin{cases} x\, \dfrac{p(u-d)}{(p+q)(1-d)} & \text{in the "up" state}, \\ x & \text{in the "mean" state}, \\ x\, \dfrac{q(u-d)}{(p+q)(u-1)} & \text{in the "down" state}. \end{cases}$$
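
The closed-form terminal wealth of the one-period example can be cross-checked by solving the primal problem directly. The sketch below is an illustration under stated assumptions: the probabilities and returns are hypothetical, the investor holds a fraction `w` of the initial wealth in the stock (with $S_0 = 1$, $m = 1$ and the riskless asset as numeraire), and the maximization is a plain ternary search on the concave objective.

```python
import math

def optimal_terminal_wealth(x, p, q, u, d):
    """Closed-form V_T^* of the example in the states (up, mean, down)."""
    return (x * p * (u - d) / ((p + q) * (1.0 - d)),
            x,
            x * q * (u - d) / ((p + q) * (u - 1.0)))

def expected_log_wealth(w, x, p, q, u, d):
    """E[ln V_T] when a fraction w of wealth x is held in the stock."""
    return (p * math.log(x * (1.0 + w * (u - 1.0)))
            + (1.0 - p - q) * math.log(x)
            + q * math.log(x * (1.0 + w * (d - 1.0))))

def best_fraction(x, p, q, u, d, iters=200):
    """Ternary search on the interval of fractions keeping wealth positive."""
    lo = -1.0 / (u - 1.0) + 1e-9
    hi = 1.0 / (1.0 - d) - 1e-9
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if expected_log_wealth(m1, x, p, q, u, d) < expected_log_wealth(m2, x, p, q, u, d):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# Hypothetical parameters chosen only for illustration.
x, p, q, u, d = 1.0, 0.4, 0.3, 1.3, 0.8
w = best_fraction(x, p, q, u, d)
numeric = (x * (1.0 + w * (u - 1.0)), x, x * (1.0 + w * (d - 1.0)))
closed = optimal_terminal_wealth(x, p, q, u, d)
print(w, numeric, closed)
```

With these parameters, the numerical fraction is close to $[p(u-1) - q(1-d)] / [(p+q)(u-1)(1-d)]$, and the resulting state-by-state wealth agrees with the closed form obtained through the dual problem.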


Another standard example is based on a d—dimensional Brownian motion 
with dimension d higher than the number of available assets. 


Example 8.3 Multidimensional Brownian motion 


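Consider now a continuous-time model where the investor still has a logarithmic utility function $U(x) = \ln x$. The financial market still contains a riskless asset, taken as numeraire, and one stock $S$. Set $S_0 = 1$. Assume that $S$ is the solution of the SDE: 

$$dS_t = S_t \left(\mu\, dt + \sigma_1 dW_{1,t} + \sigma_2 dW_{2,t}\right), \tag{8.20}$$

where $W = (W_1, W_2)$ is a 2-dimensional standard Brownian motion. Thus, the price $S$ is equal to: 

$$S_t = \exp\left(\left[\mu - \frac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right)\right] t + \sigma_1 W_{1,t} + \sigma_2 W_{2,t}\right), \tag{8.21}$$

where $\mu$, $\sigma_1$, and $\sigma_2$ are constant with $\sigma_1 > 0$ and $\sigma_2 > 0$. 

Due to the predictable representation theorem, the process $Y$ has the following form: 

$$Y_t = Y_0 \exp\left(\int_0^t \left[\mu^Y(s) - \frac{1}{2}\left((\sigma^Y_{1,s})^2 + (\sigma^Y_{2,s})^2\right)\right] ds + \int_0^t \sigma^Y_{1,s}\, dW_{1,s} + \int_0^t \sigma^Y_{2,s}\, dW_{2,s}\right). \tag{8.22}$$

The processes $Y$ and $YS$ are supermartingales. Therefore, for any $T$, 

$$E\left[Y_T \mid Y_0 = y\right] = y \exp\left(\int_0^T \mu^Y(t)\, dt\right) \le y,$$

$$E\left[Y_T S_T \mid Y_0 = y\right] = y \exp\left(\int_0^T \left[\mu + \mu^Y(t) + \sigma_1 \sigma^Y_{1,t} + \sigma_2 \sigma^Y_{2,t}\right] dt\right) \le y.$$

Thus $\mu^Y(.) = 0$. Then, we have: 

$$E\left[\ln Y_T\right] = \ln y - \frac{1}{2} \int_0^T \left[(\sigma^Y_{1,t})^2 + (\sigma^Y_{2,t})^2\right] dt.$$

Consequently, 

$$\mu + \sigma_1 \sigma^Y_{1,t} + \sigma_2 \sigma^Y_{2,t} = 0.$$

Then, 

$$E\left[\ln Y_T\right] = \ln y - \frac{1}{2} \int_0^T \frac{\mu^2 + 2\mu\sigma_2 \sigma^Y_{2,t} + (\sigma^Y_{2,t})^2 \left(\sigma_1^2 + \sigma_2^2\right)}{\sigma_1^2}\, dt.$$

Finally, we deduce that the optimal solution is given by: 

$$\sigma^{Y*}_{1} = -\frac{\mu \sigma_1}{\sigma_1^2 + \sigma_2^2}, \qquad \sigma^{Y*}_{2} = -\frac{\mu \sigma_2}{\sigma_1^2 + \sigma_2^2},$$

$$Y_T^* = y \exp\left[\frac{\mu}{\sigma_1^2 + \sigma_2^2}\left(-\frac{\mu}{2} T - \sigma_1 W_{1,T} - \sigma_2 W_{2,T}\right)\right],$$

$$\tilde{\mathcal{J}}(y) = -1 - E\left[\ln Y_T^*\right] = -1 - \ln y + \frac{\mu^2 T}{2\left(\sigma_1^2 + \sigma_2^2\right)}.$$

Since $y = \mathcal{J}'(x) = 1/x$, we deduce 

$$V_T^* = V_0 \exp\left[\frac{\mu}{\sigma_1^2 + \sigma_2^2}\left(\frac{\mu}{2} T + \sigma_1 W_{1,T} + \sigma_2 W_{2,T}\right)\right].$$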


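REMARK 8.1 The optimal portfolio value $V_T^*$ is an increasing function of the stock price at maturity. Indeed, using the relation: 

$$S_T \exp\left(-\left[\mu - \frac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right)\right] T\right) = \exp\left(\sigma_1 W_{1,T} + \sigma_2 W_{2,T}\right),$$

we deduce: 

$$V_T^* = V_0 \times \nu_T \times S_T^{\frac{\mu}{\sigma_1^2 + \sigma_2^2}},$$

where $\nu_T$ is a deterministic function given by: 

$$\nu_T = \exp\left[\frac{\mu}{\sigma_1^2 + \sigma_2^2}\left(\frac{\mu}{2} T - \left(\mu - \frac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right)\right) T\right)\right].$$

The concavity/convexity of $V_T^*$ only depends on the comparison between $\mu$ and $\sigma_1^2 + \sigma_2^2$. 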
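
The representation of the optimal wealth as a power of the terminal stock price can be verified path by path. The following sketch is only an illustration: the parameter values are hypothetical, the riskless asset is the numeraire (so the drift is an excess drift), and the Brownian values at maturity are drawn with the standard library random generator.

```python
import math
import random

def log_optimal_wealth(v0, mu, s1, s2, T, w1, w2):
    """V_T^* of Example 8.3 as a function of the Brownian values W_{1,T}, W_{2,T}."""
    k = mu / (s1**2 + s2**2)
    return v0 * math.exp(k * (0.5 * mu * T + s1 * w1 + s2 * w2))

def stock_price(mu, s1, s2, T, w1, w2):
    """S_T from (8.21) with S_0 = 1."""
    return math.exp((mu - 0.5 * (s1**2 + s2**2)) * T + s1 * w1 + s2 * w2)

def wealth_from_stock(v0, mu, s1, s2, T, s_T):
    """V_0 * nu_T * S_T**k, the representation used in Remark 8.1."""
    var = s1**2 + s2**2
    k = mu / var
    nu_T = math.exp(k * (0.5 * mu * T - (mu - 0.5 * var) * T))
    return v0 * nu_T * s_T**k

random.seed(0)
v0, mu, s1, s2, T = 1.0, 0.05, 0.2, 0.1, 1.0  # hypothetical values
for _ in range(5):
    w1 = random.gauss(0.0, math.sqrt(T))
    w2 = random.gauss(0.0, math.sqrt(T))
    direct = log_optimal_wealth(v0, mu, s1, s2, T, w1, w2)
    via_stock = wealth_from_stock(v0, mu, s1, s2, T, stock_price(mu, s1, s2, T, w1, w2))
    print(direct, via_stock)
```

Here the exponent $\mu/(\sigma_1^2 + \sigma_2^2)$ equals 1, so the wealth is linear in $S_T$; a larger (resp. smaller) ratio would make it convex (resp. concave).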
8.1.2 Dynamic programming and viscosity solutions 


The standard dynamic approach can be extended to portfolio analysis with- 
in incomplete markets. The value function can be characterized as a viscosity 
solution of the associated Bellman equation. In general, the value function is 
not smooth. However, as shown by Pham [408], for CRRA utility functions, 
the non-linearity of the Bellman equation can be reduced by using a logarith- 
mic transformation, as introduced by Fleming [230] in a stochastic control 
setting. This allows us to get a semilinear equation with quadratic growth on 
the derivative term. 


8.1.2.1 Stochastic volatility 


In what follows, the results of Pham [408] are presented (see also Za- 
riphopoulou [510] for the univariate case). 


- The financial market contains a riskless asset with rate $r$. The price $S$ of the $d$ risky assets is assumed to be a semimartingale on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$, and to be the solution of the SDE: 

$$dS_t = \mathrm{diag}(S_t) \left[\mu(F_t)\, dt + \sigma_1(F_t)\, dW_{1,t} + \sigma_2(F_t)\, dW_{2,t}\right], \tag{8.23}$$

where $W_1$ is a $d$-dimensional standard Brownian motion, and $W_2$ is an $m$-dimensional standard Brownian motion, independent of $W_1$. 

$\mathrm{diag}(S_t)$ is the diagonal $d \times d$ matrix with $M_{ii} = S_{i,t}$. 

The function $\sigma_1(.)$ is $\mathbb{R}^{d \times d}$-valued, and $\sigma_2(.)$ is a $d \times m$ matrix. Both $\sigma_1(.)$ and $\sigma_2(.)$ are assumed to be continuous functions. 
The process $F$ refers to stochastic factors and is defined from: 
$$dF_t = \eta(F_t)\, dt + dW_{1,t},$$
where the function $\eta(.)$ is assumed to be Lipschitzian. 

Denote $\bar{\mu}(y) = \mu(y) - r\mathbf{1}$ the excess rate of return w.r.t. the riskless rate ($\mathbf{1}$ is the vector of ones in $\mathbb{R}^d$). 

Denote also $\Sigma(y) = [\sigma_1(y)\ \sigma_2(y)]$ the matrix-valued volatility of the risky assets. The matrix $\Sigma$ is of full rank equal to $d$. Denote: 


Eul]? 
aie EE TE 


Assume that there exists a positive constant C such that: 


HCI Ve) < CU + llyll, 
lle) / v2) < CU + Ilyl)- 
- The investor has a weighting process $(w_t)_t$ which is, as usual, assumed to be predictable and integrable: 

$$\sup_{t \in [0,T]} E\left[\exp\left(a \left|\Sigma(F_t)^{\top} w_t\right|^2\right)\right] < \infty, \quad \text{for some constant } a > 0.$$

Denote by $\mathcal{A}$ the set of admissible strategies. The investor has a power utility $U(x) = x^{\alpha}/\alpha$, with $0 < \alpha < 1$. 


The value function of the investor is defined by: for any $(t, x, f) \in [0,T] \times \mathbb{R}^+ \times \mathbb{R}^d$, 

$$\mathcal{J}(t, x, f) = \sup_{w \in \mathcal{A}} E\left[U(V_T) \mid V_t = x, F_t = f\right]. \tag{8.24}$$











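- The dynamic programming equation (the Hamilton-Jacobi-Bellman equation) associated to the stochastic control problem is the nonlinear PDE: 

$$\frac{\partial \mathcal{J}}{\partial t} + r x \frac{\partial \mathcal{J}}{\partial x} + \eta(f)^{\top} D_f \mathcal{J} + \frac{1}{2} \Delta_f \mathcal{J} \tag{8.25}$$
$$+ \max_{w} \left[w^{\top} \bar{\mu}(f)\, x \frac{\partial \mathcal{J}}{\partial x} + \frac{1}{2} \left|\Sigma(f)^{\top} w\right|^2 x^2 \frac{\partial^2 \mathcal{J}}{\partial x^2} + x\, w^{\top} \sigma_1(f)\, D^2_{xf} \mathcal{J}\right] = 0, \tag{8.26}$$

where $D_f u$ denotes the gradient vector of $u$ w.r.t. $f$, $\Delta_f u$ is the Laplacian of $u$ w.r.t. $f$, and $D^2_{xf} u$ is the vector of second cross derivatives w.r.t. $(x, f)$. 

The terminal value is: 
$$\mathcal{J}(T, x, f) = \frac{x^{\alpha}}{\alpha}.$$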


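Since the utility function is homogeneous, and the dynamics of the wealth process depends linearly on the control $w$, a logarithmic transformation can be introduced and the value function can be searched in the form: 

$$\mathcal{J}(t, x, f) = \frac{x^{\alpha}}{\alpha} \exp\left[-\varphi(t, f)\right].$$

Therefore, we have: 

$$\frac{\partial \mathcal{J}}{\partial t} = -\frac{\partial \varphi}{\partial t} \mathcal{J}, \qquad x \frac{\partial \mathcal{J}}{\partial x} = \alpha \mathcal{J}, \qquad x^2 \frac{\partial^2 \mathcal{J}}{\partial x^2} = \alpha(\alpha - 1) \mathcal{J},$$

$$D_f \mathcal{J} = -\mathcal{J} D_f \varphi; \quad \Delta_f \mathcal{J} = \left[-\Delta_f \varphi + |D_f \varphi|^2\right] \mathcal{J}; \quad x D^2_{xf} \mathcal{J} = -\alpha \mathcal{J} D_f \varphi. \tag{8.27}$$

Thus, the function $\varphi$ is the solution of the following semilinear PDE: 

$$-\frac{\partial \varphi}{\partial t} - \frac{1}{2} \Delta_f \varphi + H(f, D_f \varphi) = 0 \quad \text{with } \varphi(T, f) = 0, \tag{8.28}$$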




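and the function $H$ is defined by: 

$$H(f, p) = \frac{1}{2} |p|^2 - \eta(f)^{\top} p + \alpha r + \max_{w} \left\{\alpha\, w^{\top} \left(\bar{\mu}(f) - \sigma_1(f) p\right) - \frac{\alpha(1-\alpha)}{2} \left|\Sigma(f)^{\top} w\right|^2\right\}.$$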


PROPOSITION 8.1 Equivalence of both PDE 

Assume that there exists a solution $\varphi$ to the previous semilinear PDE (which is supposed to be continuously differentiable w.r.t. current time $t$, and twice continuously differentiable w.r.t. $f$), with terminal value $\varphi(T, f) = 0$. 

Then, the value function of the first problem is given by: 

$$\mathcal{J}(t, x, f) = \frac{x^{\alpha}}{\alpha} \exp\left[-\varphi(t, f)\right].$$

In addition, an optimal portfolio is given by the Markov control $w_t^* = w^*(t, F_t)$, with 

$$w^*(t, f) \in \arg\min_{w} \left[\frac{1-\alpha}{2} \left|\Sigma(f)^{\top} w\right|^2 - w^{\top} \left(\bar{\mu}(f) - \sigma_1(f) D_f \varphi(t, f)\right)\right].$$

If there is no specific constraint on $w$, the optimal Markov control is given by: 

$$w^*(t, f) = \frac{1}{1-\alpha} \left(\Sigma(f) \Sigma(f)^{\top}\right)^{-1} \left(\bar{\mu}(f) - \sigma_1(f) D_f \varphi(t, f)\right).$$


REMARK 8.2 In the case of constant coefficients, which corresponds to an extension of Merton's model, the solution does not depend on $f$ and is equal to: 

$$\varphi(t, f) = -\lambda (T - t),$$

where the parameter $\lambda$ is given by: 

$$\lambda = \alpha r + \max_{w} \left[\alpha\, w^{\top} \bar{\mu} - \frac{\alpha(1-\alpha)}{2} \left|\Sigma^{\top} w\right|^2\right].$$

The optimal proportion is constant and given by: 

$$w^* \in \arg\min_{w} \left[\frac{1-\alpha}{2} \left|\Sigma^{\top} w\right|^2 - w^{\top} \bar{\mu}\right].$$

However, in a general stochastic volatility model, no closed-form solution is available for the parabolic Cauchy problem (8.28). 
It must be solved numerically, but it is simpler than the initial Bellman equation. 
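
In the constant-coefficient case with a single risky asset, the maximization appearing in the remark can be carried out by hand. The sketch below is a minimal illustration under stated assumptions: one stock with excess return `mu_bar` and volatility `sigma`, no constraint on the weight, and hypothetical numerical values; the function names are illustrative only.

```python
def merton_weight(mu_bar, sigma, alpha):
    """Unconstrained maximizer of alpha*w*mu_bar - alpha*(1-alpha)/2 * sigma^2 * w^2."""
    return mu_bar / ((1.0 - alpha) * sigma**2)

def objective(w, mu_bar, sigma, alpha):
    """Quadratic term maximized over the weight w in the definition of the parameter."""
    return alpha * w * mu_bar - 0.5 * alpha * (1.0 - alpha) * sigma**2 * w**2

def growth_parameter(mu_bar, sigma, alpha, r):
    """alpha*r plus the maximal value of the quadratic objective."""
    return alpha * r + objective(merton_weight(mu_bar, sigma, alpha), mu_bar, sigma, alpha)

# Hypothetical parameters: 6% excess return, 20% volatility, alpha = 0.5, r = 2%.
mu_bar, sigma, alpha, r = 0.06, 0.2, 0.5, 0.02
w_star = merton_weight(mu_bar, sigma, alpha)
print(w_star, growth_parameter(mu_bar, sigma, alpha, r))
```

With these values the constant weight is 3 and the parameter equals $0.01 + 0.045 = 0.055$; a coarse grid around `w_star` gives no larger value of the objective.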
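Example 8.4 Hull-White and Scott models 
Assume that 

$$\frac{dS_t}{S_t} = \left(r + \bar{\mu}(F_t)\right) dt + \rho\, e^{F_t}\, dW_{1,t} + \sqrt{1 - \rho^2}\, e^{F_t}\, dW_{2,t},$$
$$dF_t = (a - \theta F_t)\, dt + dW_{1,t},$$

where $a$ and $\theta \neq 0$ are constant and $\theta$ is the rate of mean-reversion. 
The parameter $\rho$ is a constant correlation coefficient. 
In that case, we have: 

$$\eta(f) = a - \theta f, \quad \sigma_1(f) = \rho\, e^{f} \quad \text{and} \quad |\Sigma(f)| = e^{f}.$$

Therefore, without constraint on the weighting $w$, the value function of the optimization problem is given by 

$$\mathcal{J}(t, x, f) = \frac{x^{\alpha}}{\alpha} \exp\left[-\varphi(t, f)\right],$$

where $\varphi(.,.)$ is the unique solution of the previous Cauchy problem: 

$$-\frac{\partial \varphi}{\partial t} - \frac{1}{2} \frac{\partial^2 \varphi}{\partial f^2} + \frac{1}{2}\left(1 + \frac{\alpha}{1-\alpha} \rho^2\right) \left(\frac{\partial \varphi}{\partial f}\right)^2 - \left(a - \theta f + \frac{\alpha}{1-\alpha}\, \frac{\rho\, \bar{\mu}(f)}{e^{f}}\right) \frac{\partial \varphi}{\partial f} + \alpha r + \frac{\alpha}{2(1-\alpha)}\, \frac{\bar{\mu}(f)^2}{e^{2f}} = 0. \tag{8.29}$$

Additionally, the optimal portfolio is given by: 

$$w_t^* = \frac{1}{1-\alpha} \left(\frac{\bar{\mu}(F_t)}{e^{2F_t}} - \frac{\rho}{e^{F_t}} \frac{\partial \varphi}{\partial f}(t, F_t)\right), \quad \text{a.s.}, \ 0 \le t \le T.$$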
8.2 Optimization with constraints 


As seen in Chapter 3, specific constraints are often introduced, such as no 
short-selling, lower and upper bounds on portfolio weights, etc. 


In the continuous-time setting, Cvitanic and Karatzas [137] study the stochastic dynamic control problem associated to the maximization of the expected utility of consumption and terminal value. 

Mathematically speaking, the set of constraints is supposed to be a given closed and convex subset of $\mathbb{R}^d$. As seen in what follows, the idea is to embed the constrained problem in a family of unconstrained ones, and then to find an element of this family which satisfies the required constraints. Such results cover, in particular, incompleteness and no short-selling constraints. Again, this approach is based on martingale theory, duality theory, and convex analysis. 


8.2.1 General result 


- The financial market contains a riskless asset with rate $r$. The price $S$ of the $d$ risky assets is assumed to be a semimartingale on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$, solution of the SDE: 

$$dS_t = \mathrm{diag}(S_t) \left[\mu(t)\, dt + \sigma(t)\, dW_t\right], \tag{8.30}$$

where $W$ is a $d$-dimensional standard Brownian motion, and the usual previous assumptions on the coefficient functions are made. 

This financial market is complete. Thus, there exists one and only one risk-neutral probability $Q$, defined as follows. 
Denote the relative risk process $\eta$ by: 

$$\eta(t) = \sigma(t)^{-1} \left[\mu(t) - r(t) \mathbf{1}\right], \quad \text{with} \quad \int_0^T |\eta(t)|^2\, dt < \infty \ \text{a.s.}$$

The exponential local martingale $L$: 

$$L(t) = \exp\left[-\frac{1}{2} \int_0^t |\eta(s)|^2\, ds - \int_0^t \eta(s)^{\top} dW_s\right]$$

is the Radon-Nikodym density of the risk-neutral probability $Q$ w.r.t. the probability $P$. 
Denote $R$ as the discount factor: 

$$R(t) = \exp\left[-\int_0^t r(s)\, ds\right].$$

Denote $M$ the product: $M(t) = R(t) L(t)$. 


- Set $K$ of constraints: The constraints are assumed to be such that the set $K$ is a non-empty, closed, and convex subset of $\mathbb{R}^d$. 

Denote by $\delta$ the support function of the convex set $-K$, defined on $\mathbb{R}^d$, and with values in $\mathbb{R} \cup \{+\infty\}$: 

$$\delta(x) = \delta(x \mid K) = \sup_{w \in K} \left(-w^{\top} x\right).$$

The function $\delta$ is a closed, positively homogeneous, proper convex function on $\mathbb{R}^d$ (see for example, Rockafellar [426] for details about these notions). 

The effective domain of the function $\delta$ is the set $\tilde{K}$ defined by: 

$$\tilde{K} = \{x \in \mathbb{R}^d,\ \delta(x) < \infty\} = \{x \in \mathbb{R}^d;\ \exists \beta \in \mathbb{R},\ -w^{\top} x \le \beta,\ \forall w \in K\}.$$

The set $\tilde{K}$ is a convex cone, called the "barrier cone" of $-K$. 

In what follows, it is assumed that the function $\delta(. \mid K)$ is continuous on $\tilde{K}$ and bounded below on $\mathbb{R}^d$: for some $\delta_0 \in \mathbb{R}$, $\delta(x \mid K) \ge \delta_0$ for all $x \in \mathbb{R}^d$ (for example, if $K$ contains the origin, $\delta_0 = 0$ is convenient). 

Note also that the function $\delta(.)$ is subadditive: 
$$\delta(x + y) \le \delta(x) + \delta(y).$$


- Utility functions: these functions satisfy all the same assumptions as in the previous section devoted to incomplete markets. In particular, the conjugate function of a utility function $U$ is still denoted by $\tilde{U}$: 

$$\tilde{U}(y) = \max_{x > 0} \left[U(x) - xy\right], \quad y > 0, \tag{8.31}$$

and the inverse of the marginal utility $U'$ is denoted by $J$. 


- Constrained optimization problem: The investor seeks to maximize 

$$E\left[\int_0^T U(t, c_t)\, dt + \hat{U}(V_T)\right]$$

on the set $\mathcal{A}_K(v_0)$ of all admissible strategies $(c, w)$ satisfying the usual assumptions (see Chapter 6): 

$$\mathcal{A}_K(v_0) = \{(c, w),\ w(\omega, t) \in K, \text{ for } P \otimes dt \text{ a.s. } (\omega, t)\}.$$

The value function $\mathcal{J}_K$ is defined by: 

$$\mathcal{J}_K = \sup_{(c,w) \in \mathcal{A}_K(v_0)} E\left[\int_0^T U(t, c_t)\, dt + \hat{U}(V_T)\right].$$







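- When $K = \mathbb{R}^d$ (no constraint), recall that the solution is given by relation (6.83): 

$$c_t^* = I\left(\lambda^* \kappa(t), t\right) \quad \text{and} \quad V_T^* = J\left(\lambda^* \kappa(T)\right),$$

where $\kappa(t) = R(t) L(t)$. 
The Lagrange parameter $\lambda^*$ is determined from the budget equation. Its existence is deduced from assumptions (U) and (V), since the function $F$, defined by 

$$F(y) = E_P\left[\int_0^T \kappa(t)\, I\left(y \kappa(t), t\right) dt + \kappa(T)\, J\left(y \kappa(T)\right)\right],$$

is continuous and non-increasing on $[0, a]$ where $a = \inf\{y \mid F(y) = 0\}$. It satisfies also: 

$$\lim_{y \to 0} F(y) = +\infty; \qquad \lim_{y \to +\infty} F(y) = 0.$$

Then, the function $F$ has an inverse $F^{-1}$. 
Define the function $G$ by: 

$$G(y) = H\left(F^{-1}(y)\right),$$

where the function $H$ is given by: 

$$H(y) = E_P\left[\int_0^T U\left(I(y \kappa(t), t), t\right) dt + \hat{U}\left(J(y \kappa(T))\right)\right].$$

Under assumptions (U) and (V) for the utility functions $U$ and $\hat{U}$, there exists an optimal strategy $(c^*, w^*) \in \mathcal{A}(v_0)$ such that: 

$$\mathcal{J}(c^*, w^*, v_0) = \max_{(c,w) \in \mathcal{A}(v_0)} \mathcal{J}(c, w, v_0), \tag{8.32}$$

$$\text{where } \mathcal{J}(c, w, v_0) = E_P\left[\int_0^T U(c_s, s)\, ds + \hat{U}(V_T)\right]. \tag{8.33}$$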
Using the previous function $F$ and the inverse functions $I$ and $J$ of the marginal utility functions $U'$ and $\hat{U}'$, we have: 

$$c_t^* = I\left(F^{-1}(v_0)\, \kappa(t), t\right),$$

$$V_T^* = J\left(F^{-1}(v_0)\, \kappa(T)\right),$$

and the optimal weighting $w^*$ is deduced from the martingale representation (6.78). 

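- Auxiliary unconstrained optimization problem: Cvitanic and Karatzas [137] introduce a family of unconstrained optimization problems which embeds the constrained problem. For this purpose: 

• Consider the space $\mathcal{H}$ of $\mathcal{F}_t$-progressively measurable processes $(v_t)_t$ with values in $\mathbb{R}^d$, and such that: 

$$\|v\|^2 = E\left[\int_0^T |v_t|^2\, dt\right] < \infty.$$

• Introduce the class $\mathcal{D}$ of processes such that: 

$$\mathcal{D} = \left\{v \in \mathcal{H};\ E\left[\int_0^T \delta(v_s)\, ds\right] < \infty\right\}, \tag{8.34}$$

where $\delta(.)$ is the support function of $-K$, with $K$ the set of constraints. Note that: 

$$v \in \mathcal{D} \implies v(\omega, t) \in \tilde{K}, \text{ for } P \otimes dt \text{ a.s. } (\omega, t), \tag{8.35}$$

where $\tilde{K}$ is the barrier cone of $-K$. 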


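For any given $v \in \mathcal{D}$, consider a new financial market $\mathcal{M}_v$ with one bond and $d$ stocks: 

$$dB_t^{(v)} = B_t^{(v)} \left[r(t) + \delta(v_t)\right] dt, \tag{8.36}$$
$$dS_{i,t}^{(v)} = S_{i,t}^{(v)} \left[\left(\mu_i(t) + v_i(t) + \delta(v_t)\right) dt + \sum_{j=1}^{d} \sigma_{ij}(t)\, dW_{j,t}\right]. \tag{8.37}$$

Associated to the process $v \in \mathcal{D}$, denote also: 

* The relative risk process $\eta^{(v)}$ by: 

$$\eta^{(v)}(t) = \sigma(t)^{-1} \left[\mu(t) + v(t) + \delta(v_t) \mathbf{1} - \left(r(t) + \delta(v_t)\right) \mathbf{1}\right] = \eta(t) + \sigma(t)^{-1} v(t).$$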
* The exponential local martingale $L^{(v)}$: 

$$L^{(v)}(t) = \exp\left[-\frac{1}{2} \int_0^t \left|\eta^{(v)}(s)\right|^2 ds - \int_0^t \eta^{(v)}(s)^{\top} dW_s\right] \tag{8.39}$$

is the Radon-Nikodym density of the risk-neutral probability $Q^{(v)}$ w.r.t. the probability $P$. 

* Denote $R^{(v)}$ the discount factor: 
$$R^{(v)}(t) = \exp\left[-\int_0^t \left(r(s) + \delta(v_s)\right) ds\right]. \tag{8.40}$$
* Denote $M^{(v)}$ the product: $M^{(v)}(t) = R^{(v)}(t) L^{(v)}(t)$. 
* Consider the new set of admissible strategies: 
- The wealth process $V^{(v)}$ satisfies: 
$$dV_t^{(v)} = \left[\left(r(t) + \delta(v_t)\right) V_t^{(v)} - c^{(v)}(t)\right] dt + V_t^{(v)}\, w^{(v)}(t)^{\top} \sigma(t)\, dW_t^{(v)}, \tag{8.41}$$

where $W_t^{(v)} = W_t + \int_0^t \eta^{(v)}(s)\, ds$ is a standard Brownian motion under the risk-neutral probability $Q^{(v)}$. 


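- The investor seeks to maximize (through the strategy $(c^{(v)}, w^{(v)})$): 

$$E\left[\int_0^T U\left(t, c_t^{(v)}\right) dt + \hat{U}\left(V_T^{(v)}\right)\right] \tag{8.42}$$

on the set $\mathcal{A}^{(v)}$ of all admissible strategies $(c^{(v)}, w^{(v)})$: 
$$\mathcal{A}^{(v)} = \left\{(c^{(v)}, w^{(v)}),\ w^{(v)}(\omega, t) \in K, \text{ for } P \otimes dt \text{ a.s. } (\omega, t)\right\}. \tag{8.43}$$

The value function $\mathcal{J}^{(v)}$ is still defined by: 

$$\mathcal{J}^{(v)} = \sup_{(c^{(v)}, w^{(v)}) \in \mathcal{A}^{(v)}} E\left[\int_0^T U\left(t, c_t^{(v)}\right) dt + \hat{U}\left(V_T^{(v)}\right)\right]. \tag{8.44}$$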








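The Lagrange parameter $\lambda^{(v)*}$ is determined from the budget equation. Its existence follows from assumptions (U) and (V). Indeed, the function $F^{(v)}$, defined by 

$$F^{(v)}(y) = E_P\left[\int_0^T \kappa^{(v)}(t)\, I\left(y \kappa^{(v)}(t), t\right) dt + \kappa^{(v)}(T)\, J\left(y \kappa^{(v)}(T)\right)\right], \tag{8.45}$$

where $\kappa^{(v)}(t) = R^{(v)}(t) L^{(v)}(t)$, has an inverse $F^{(v)-1}$. 
Define the function $G^{(v)}$ by: 
$$G^{(v)}(y) = H^{(v)}\left(F^{(v)-1}(y)\right),$$

where the function $H^{(v)}$ is given by: 

$$H^{(v)}(y) = E_P\left[\int_0^T U\left(I(y \kappa^{(v)}(t), t), t\right) dt + \hat{U}\left(J(y \kappa^{(v)}(T))\right)\right]. \tag{8.46}$$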








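PROPOSITION 8.2 
Under assumptions (U) and (V) for the utility functions $U$ and $\hat{U}$, there exists an optimal strategy $(c^{(v)*}, w^{(v)*}) \in \mathcal{A}^{(v)}(v_0)$ such that: 

$$\mathcal{J}^{(v)}\left(c^{(v)*}, w^{(v)*}, v_0\right) = \max_{(c^{(v)}, w^{(v)}) \in \mathcal{A}^{(v)}(v_0)} \mathcal{J}\left(c^{(v)}, w^{(v)}, v_0\right), \tag{8.47}$$

$$\text{where } \mathcal{J}\left(c^{(v)}, w^{(v)}, v_0\right) = E_P\left[\int_0^T U\left(c_s^{(v)}, s\right) ds + \hat{U}\left(V_T^{(v)}\right)\right]. \tag{8.48}$$

Using the previous function $F^{(v)}$ and the inverse functions $I$ and $J$ of the marginal utility functions $U'$ and $\hat{U}'$, we have: 

$$c_t^{(v)*} = I\left(F^{(v)-1}(v_0)\, \kappa^{(v)}(t), t\right),$$
$$V_T^{(v)*} = J\left(F^{(v)-1}(v_0)\, \kappa^{(v)}(T)\right),$$

and the optimal weighting $w^{(v)*}$ is deduced from the martingale representation (6.78). 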


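Introduce the class $\mathcal{D}'$ of processes: 

$$\mathcal{D}' = \left\{v \in \mathcal{D},\ F^{(v)}(y) < \infty, \text{ for all } y > 0\right\}, \tag{8.49}$$
$$\mathcal{A}^{(v)}(v_0) = \left\{(c, w),\ w(\omega, t) \in K, \text{ for } P \otimes dt \text{ a.s. } (\omega, t)\right\}. \tag{8.50}$$

- The optimization problem: 

$$\mathcal{J}^{(v)}\left(c^{(v)*}, w^{(v)*}, v_0\right) = \max_{(c^{(v)}, w^{(v)}) \in \mathcal{A}^{(v)}(v_0)} \mathcal{J}\left(c^{(v)}, w^{(v)}, v_0\right), \tag{8.51}$$

$$\text{with } \mathcal{J}\left(c^{(v)}, w^{(v)}, v_0\right) = E_P\left[\int_0^T U\left(c_s^{(v)}, s\right) ds + \hat{U}\left(V_T^{(v)}\right)\right]. \tag{8.52}$$

Then, under assumptions (U) and (V) for the utility functions $U$ and $\hat{U}$, there exists an optimal strategy $(c^{(v)*}, w^{(v)*}) \in \mathcal{A}^{(v)}(v_0)$. Using the previous function $F^{(v)}$ and the inverse functions $I$ and $J$ of the marginal utility functions $U'$ and $\hat{U}'$, we have: 

$$c_t^{(v)*} = I\left(F^{(v)-1}(v_0)\, \kappa^{(v)}(t), t\right),$$
$$V_T^{(v)*} = J\left(F^{(v)-1}(v_0)\, \kappa^{(v)}(T)\right).$$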


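- Equivalent optimization conditions: 
PROPOSITION 8.3 
Suppose that for some process $\lambda \in \mathcal{D}'$, we have: 
$$w^{(\lambda)*}(\omega, t) \in K, \quad \delta\left(\lambda(\omega, t)\right) + w^{(\lambda)*}(\omega, t)^{\top} \lambda(\omega, t) = 0. \tag{8.53}$$

Then the pair $(c^{(\lambda)*}, w^{(\lambda)*})$ belongs to $\mathcal{A}_K(v_0)$ and is optimal for the constrained optimization problem in the original market. 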


- Consider a solution $(c_K^*, w_K^*)$ of the constrained optimization problem (A): 

$$\sup_{(c_K, w_K) \in \mathcal{A}_K(v_0)} E\left[\int_0^T U\left(t, c_K(t)\right) dt + \hat{U}(V_T)\right].$$





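Cvitanic and Karatzas [137] characterize the solution of Problem (A) by using the following conditions (B)-(E) for a given process $\lambda$ in the class $\mathcal{D}'$: 
* (B) Financiability of $(c^{(\lambda)*}, V_T^{(\lambda)*})$. There exists a portfolio process $w_K$ such that $(c^{(\lambda)*}, w_K) \in \mathcal{A}_K$ and, for $P \otimes dt$ a.s. $(\omega, t)$: 
$$w_K(\omega, t) \in K, \quad \delta\left(\lambda(\omega, t)\right) + w_K(\omega, t)^{\top} \lambda(\omega, t) = 0,$$
$$V^{(c^{(\lambda)*}, w_K)}(\omega, t) = V^{(\lambda)*}(\omega, t).$$

* (C) Minimality of $\lambda$. For every $v \in \mathcal{D}$, we have: 

$$E\left[\int_0^T U\left(t, c_t^{(\lambda)*}\right) dt + \hat{U}\left(V_T^{(\lambda)*}\right)\right] \le E\left[\int_0^T U\left(t, c_t^{(v)*}\right) dt + \hat{U}\left(V_T^{(v)*}\right)\right].$$

* (D) Dual optimality of $\lambda$. For every $v \in \mathcal{D}$, we have, with $y = F^{(\lambda)-1}(v_0)$: 

$$E_P\left[\int_0^T \tilde{U}\left(t, y \kappa^{(\lambda)}(t)\right) dt + \tilde{\hat{U}}\left(y \kappa^{(\lambda)}(T)\right)\right] \le E_P\left[\int_0^T \tilde{U}\left(t, y \kappa^{(v)}(t)\right) dt + \tilde{\hat{U}}\left(y \kappa^{(v)}(T)\right)\right].$$

* (E) Parsimony of $\lambda$. For every $v \in \mathcal{D}$, we have: 

$$E_P\left[\int_0^T \kappa^{(v)}(t)\, c^{(\lambda)*}(t)\, dt + \kappa^{(v)}(T)\, V_T^{(\lambda)*}\right] \le v_0.$$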





THEOREM 8.4 Equivalence of conditions 
Conditions (B)-(E) are equivalent and imply property (A) with 

$$(c_K^*, w_K^*) = \left(c^{(\lambda)*}, w^{(\lambda)*}\right). \tag{8.54}$$

In addition, conversely, property (A) implies the existence of $\lambda \in \mathcal{D}'$ which satisfies conditions (B)-(E) with $w_K^* = w^{(\lambda)*}$ under the following assumptions: 

(i) The utility functions satisfy: 

$$c \mapsto c\, U'(c) \quad \text{and} \quad x \mapsto x\, \hat{U}'(x) \quad \text{are nondecreasing on } (0, \infty), \tag{8.55}$$

and for some $\alpha \in (0,1)$, $\gamma \in (1, \infty)$, we have: 

$$\alpha U'(x) \ge U'(\gamma x), \quad \text{for all } x > 0. \tag{8.56}$$

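From the previous result, we are led again to the dual stochastic control problem: 

$$\tilde{\mathcal{J}}(y) = \inf_{v \in \mathcal{D}} E\left[\int_0^T \tilde{U}\left(t, y \kappa^{(v)}(t)\right) dt + \tilde{\hat{U}}\left(y \kappa^{(v)}(T)\right)\right]. \tag{8.57}$$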








THEOREM 8.5 Existence of a solution of the constrained optimization 

Under all previous assumptions on the utility functions, there exists an optimal pair $(c_K^*, w_K^*)$ for the constrained portfolio problem. 


8.2.2 Basic examples 


Examine the optimal solution for various constraints: no short-selling, up- 
per and lower bounds on the weighting vector, etc. 


8.2.2.1 Standard strategy constraints 


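Example 8.5 General closed convex cone K 
We have: 

$$\delta(x) = \begin{cases} 0 & \text{if } x \in \tilde{K}, \\ \infty & \text{otherwise}. \end{cases}$$

Recall that $\tilde{K}$ is the polar cone of $-K$: 
$$\tilde{K} = \{x \in \mathbb{R}^d;\ w^{\top} x \ge 0,\ \forall w \in K\}.$$

Note that, for the unconstrained case, $K = \mathbb{R}^d$, 

$$\delta(x) = \begin{cases} 0 & \text{if } x = 0, \\ \infty & \text{otherwise}, \end{cases} \qquad \tilde{K} = \{0\}.$$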


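Example 8.6 No short-selling 

Let: 
$$K = \{w \in \mathbb{R}^d,\ w_i \ge 0,\ \forall i = 1, \ldots, d\}.$$
Then: 
$$\delta(x) = \begin{cases} 0 & \text{if } x \in K, \\ \infty & \text{otherwise}, \end{cases} \quad \text{and} \quad \tilde{K} = K.$$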


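Example 8.7 Incomplete market 
Let: 

$$K = \{w \in \mathbb{R}^d,\ w_i = 0,\ \forall i = m+1, \ldots, d\}, \quad \text{for some } m \in \{1, \ldots, d-1\}.$$

Then: 
$$\delta(x) = \begin{cases} 0, & \text{if } x_1 = \ldots = x_m = 0, \\ \infty, & \text{otherwise}, \end{cases}$$

and 
$$\tilde{K} = \{x \in \mathbb{R}^d,\ x_i = 0,\ \forall i = 1, \ldots, m\}.$$

This case corresponds to an incomplete financial market driven by a multidimensional Brownian motion, which contains $m$ stocks with $m < d$. 
If, furthermore, there are no short-selling conditions, then: 

$$K = \{w \in \mathbb{R}^d,\ w_i \ge 0,\ \forall i \le m \text{ and } w_i = 0,\ \forall i = m+1, \ldots, d\},$$
$$\delta(x) = \begin{cases} 0, & (x_1, \ldots, x_m) \in [0, \infty[^m, \\ \infty, & \text{otherwise}, \end{cases}$$

and 
$$\tilde{K} = \{x \in \mathbb{R}^d,\ x_i \ge 0,\ \forall i = 1, \ldots, m\}.$$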
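Example 8.8 Rectangular constraints 

Let: 
$$K = \prod_{i=1}^{d} [\alpha_i, \beta_i],$$

for some fixed numbers: $-\infty \le \alpha_i \le \beta_i \le +\infty$. 

Then: 
$$\delta(x) = \sum_{i=1}^{d} \beta_i x_i^- - \sum_{i=1}^{d} \alpha_i x_i^+,$$
and 
$$\tilde{K} = \{x \in \mathbb{R}^d,\ x_i \ge 0,\ \forall i \in S^+ \text{ and } x_i \le 0,\ \forall i \in S^-\},$$
where 

$$S^+ = \{i = 1, \ldots, d \mid \beta_i = +\infty\},$$
$$S^- = \{j = 1, \ldots, d \mid \alpha_j = -\infty\}.$$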
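
The support function of a rectangular constraint set is simple to evaluate coordinate by coordinate. The sketch below is an illustration only: the helper names and the numerical bounds are hypothetical, infinite bounds are represented by `math.inf`, and the brute-force comparison over the corners of the rectangle is valid only when all bounds are finite.

```python
import itertools
import math

def support_rectangle(x, lower, upper):
    """delta(x) = sum_i upper_i * x_i^- - lower_i * x_i^+ for K = prod [lower_i, upper_i]."""
    total = 0.0
    for xi, a, b in zip(x, lower, upper):
        if xi > 0:
            total += -a * xi       # -alpha_i * x_i^+ ; +inf when alpha_i = -inf
        elif xi < 0:
            total += b * (-xi)     # beta_i * x_i^- ; +inf when beta_i = +inf
    return total

def support_brute_force(x, lower, upper):
    """sup over K of -w.x; for a finite rectangle the sup is reached at a corner."""
    corners = itertools.product(*zip(lower, upper))
    return max(-sum(wi * xi for wi, xi in zip(w, x)) for w in corners)

# Hypothetical bounds: 0 <= w_1 <= 1 and -1 <= w_2 <= 2.
lower, upper = (0.0, -1.0), (1.0, 2.0)
x = (-2.0, 3.0)
print(support_rectangle(x, lower, upper), support_brute_force(x, lower, upper))

# No short-selling corresponds to lower = 0 and upper = +inf in every coordinate.
no_short = ((0.0, 0.0), (math.inf, math.inf))
print(support_rectangle((1.0, 2.0), *no_short), support_rectangle((-1.0, 0.0), *no_short))
```

The first line prints the same value twice, and the second shows that, under no short-selling, the support function vanishes on the non-negative orthant and is infinite outside it, in line with Example 8.6.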


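8.2.2.2 Logarithmic utility case 
Assume that $U(c) = \ln c$ and $\hat{U}(v) = \ln v$. Then, we have: for every $v \in \mathcal{D}$, 

$$c^{(v)*}(t) = \frac{v_0}{T+1}\, \frac{1}{\kappa^{(v)}(t)}, \qquad V_T^{(v)*} = \frac{v_0}{T+1}\, \frac{1}{\kappa^{(v)}(T)}. \tag{8.58}$$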


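Example 8.9 
Note that $\mathcal{D} = \mathcal{D}'$. Thus, with $y = (1+T)/v_0$: 

$$E_P\left[\int_0^T \tilde{U}\left(t, y \kappa^{(v)}(t)\right) dt + \tilde{\hat{U}}\left(y \kappa^{(v)}(T)\right)\right] = -(1+T)\left(1 + \ln \frac{1+T}{v_0}\right) + E\left[\ln \frac{1}{\kappa^{(v)}(T)} + \int_0^T \ln \frac{1}{\kappa^{(v)}(t)}\, dt\right],$$

and, 

$$E\left[\ln \frac{1}{\kappa^{(v)}(t)}\right] = E_P\left[\int_0^t \left(r(s) + \delta(v_s) + \frac{1}{2} \left|\eta(s) + \sigma^{-1}(s) v_s\right|^2\right) ds\right].$$

Therefore, the optimization problem is equivalent to a pointwise minimization of the convex function $\delta(x) + \frac{1}{2} \left|\eta(t) + \sigma^{-1}(t) x\right|^2$ over $x \in \tilde{K}$, for every $t \in [0, T]$. 

Thus the process $\lambda$ is determined by: 

$$\lambda(t) = \arg\min_{x \in \tilde{K}} \left[2\delta(x) + \left|\eta(t) + \sigma^{-1}(t) x\right|^2\right]. \tag{8.59}$$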
Finally, we deduce: 

$$w_K^*(t) = \left(\sigma(t) \sigma(t)^{\top}\right)^{-1} \left[\lambda(t) + \mu(t) - r(t) \mathbf{1}\right],$$

$$c_K^*(t) = \frac{v_0}{T+1}\, \frac{1}{\kappa^{(\lambda)}(t)} = \frac{V^{(\lambda)}(t)}{1 + (T - t)}.$$


For the previous examples in which the constraint set $K$ is a cone (Examples 8.5 to 8.7), we have $\delta(.) = 0$ on $\tilde{K}$. Thus, for the logarithmic case, the problem of determining the process $\lambda \in \mathcal{D}$ reduces to that of minimizing pointwise a simple quadratic form over $\tilde{K}$: 

$$\lambda(t) = \arg\min_{x \in \tilde{K}} \left[2\delta(x) + \left|\eta(t) + \sigma^{-1}(t) x\right|^2\right]. \tag{8.60}$$

For the unconstrained case, $\lambda(t) = 0$ and we recover the standard solution: 
$$w^*(t) = \left(\sigma(t) \sigma(t)^{\top}\right)^{-1} \left[\mu(t) - r(t) \mathbf{1}\right].$$


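Therefore, for the logarithmic case, we get explicit formulas: 

• For the incomplete case, set: 

$$\sigma(t) = \begin{pmatrix} \Sigma(t) \\ \rho(t) \end{pmatrix},$$

where $\Sigma(\omega, t)$ is an $(m \times d)$ matrix of full rank and $\rho(\omega, t)$ is an $(n \times d)$ matrix with orthonormal rows that span the kernel of $\Sigma(\omega, t)$ for every $(\omega, t)$ (we have: $\rho(\omega, t) \rho(\omega, t)^{\top} = I_n$, $\Sigma(\omega, t) \rho(\omega, t)^{\top} = 0$ and $n = d - m$). 
Then, set: 

$$M(t) = {}^{t}\left(\mu_1(t), \ldots, \mu_m(t)\right),$$
$$a(t) = {}^{t}\left(\mu_{m+1}(t), \ldots, \mu_d(t)\right),$$
$$A(t) = \Sigma(t)^{\top} \left(\Sigma(t) \Sigma(t)^{\top}\right)^{-1} \left[M(t) - r(t) \mathbf{1}_m\right].$$

We have: 
$$\eta(t) = A(t) + \rho(t)^{\top} \left[a(t) - r(t) \mathbf{1}_n\right].$$

For any $v \in \tilde{K}$, necessarily of the form $v(t) = {}^{t}(0_m, N)$ for some $N \in \mathbb{R}^n$, 

$$\left|\eta(t) + \sigma^{-1}(t) v(t)\right|^2 = \left|A(t) + \rho(t)^{\top} \left(a(t) - r(t) \mathbf{1}_n + N\right)\right|^2.$$

Since $A(t)$ and $\rho(t)^{\top} \left(a(t) - r(t) \mathbf{1}_n + N\right)$ are orthogonal, we have: 
$$\left|\eta(t) + \sigma^{-1}(t) v(t)\right|^2 = |A(t)|^2 + \left|a(t) - r(t) \mathbf{1}_n + N\right|^2.$$
Therefore, the minimization (8.60) is achieved by the random vector 

$$\lambda(t) = \begin{pmatrix} 0_m \\ \Lambda(t) \end{pmatrix}, \quad \text{where } \Lambda(t) = r(t) \mathbf{1}_n - a(t).$$

Thus, we deduce (result in Karatzas et al. [319]): 

$$w_K^*(t) = \begin{pmatrix} \left(\Sigma(t) \Sigma(t)^{\top}\right)^{-1} \left[M(t) - r(t) \mathbf{1}_m\right] \\ 0_n \end{pmatrix}.$$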


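For the rectangular constraints, the process $\lambda$ can also be determined: 
- First case (only one stock): $S^+ = S^- = \emptyset$. 

Since $d = 1$, $K$ has the form $[\alpha, \beta]$ and $\delta(x) = \beta x^- - \alpha x^+$. The process $\lambda$ is defined by: 

$$\lambda(t) = \begin{cases} \sigma(t) \left[\sigma(t) \beta - \eta(t)\right], & \text{if } \sigma(t) \beta < \eta(t), \\ \sigma(t) \left[\sigma(t) \alpha - \eta(t)\right], & \text{if } \sigma(t) \alpha > \eta(t), \\ 0, & \text{otherwise}. \end{cases}$$

Therefore, the optimal portfolio is given by: 

$$w_K^*(t) = \begin{cases} \beta, & \text{if } \sigma^{-1}(t) \eta(t) > \beta, \\ \alpha, & \text{if } \sigma^{-1}(t) \eta(t) < \alpha, \\ \sigma^{-1}(t) \eta(t), & \text{otherwise}. \end{cases}$$

Note that the optimal portfolio $w_K^*(t)$ is equal to the unconstrained optimal one $w_{\mathbb{R}^d}(t)$, as long as $w_{\mathbb{R}^d}(t)$ is in the interval $[\alpha, \beta]$. Otherwise, $w_K^*(t)$ is equal to the point of $[\alpha, \beta]$ closest to $w_{\mathbb{R}^d}(t)$. 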
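
In the one-stock case, the dual process and the constrained weight can be computed directly from the two displayed formulas. The following sketch is an illustration under stated assumptions: the volatility is taken strictly positive, the numerical values are hypothetical, and the function names are not from the text.

```python
def dual_process(eta, sigma, alpha, beta):
    """lambda(t) for K = [alpha, beta] with one stock and log utility (sigma > 0 assumed)."""
    if sigma * beta < eta:
        return sigma * (sigma * beta - eta)
    if sigma * alpha > eta:
        return sigma * (sigma * alpha - eta)
    return 0.0

def constrained_weight(eta, sigma, alpha, beta):
    """w = sigma^{-1} (eta + sigma^{-1} lambda): the weight implied by the dual process."""
    lam = dual_process(eta, sigma, alpha, beta)
    return (eta + lam / sigma) / sigma

def clipped_weight(eta, sigma, alpha, beta):
    """Unconstrained log-optimal weight eta/sigma projected on [alpha, beta]."""
    return min(max(eta / sigma, alpha), beta)

# Hypothetical values: eta = 0.5, sigma = 0.2, so the unconstrained weight is 2.5.
for bounds in [(0.0, 1.0), (0.0, 3.0), (3.0, 4.0)]:
    print(bounds, constrained_weight(0.5, 0.2, *bounds), clipped_weight(0.5, 0.2, *bounds))
```

In each case the weight obtained through the dual process coincides with the projection of the unconstrained weight on the interval, which is the property noted above for a single stock.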


- Second case (two stocks): suppose for example that $\alpha = {}^{t}(0, 0)$, $\beta = {}^{t}(1, 1)$, $\eta = {}^{t}(1, 2)$ and 


a) = ie E 


For the unconstrained case, the optimal portfolio is given by:

$$w_{\mathbb{R}^d}(t) = {}^t\sigma^{-1}(t)\,\eta = {}^t(-1/3, -4/3).$$

The optimal constrained portfolio $w_K(t)$ is no longer the portfolio in K which is the closest one to the unconstrained optimal portfolio $w_{\mathbb{R}^d}(t)$. Otherwise, it would be equal to ${}^t(0,0)$.
Indeed, the minimization:

$$\lambda(t) = \arg\min_{x \in \tilde{K}} \left[ \frac{1}{2}\left\|\eta + \sigma^{-1}x\right\|^2 - {}^t\alpha\, x^{+} + {}^t\beta\, x^{-} \right], \quad x \in \mathbb{R}^2, \qquad (8.61)$$

leads to the value $\lambda = {}^t(13.5, 0)$, and the optimal portfolio is given by:

$$w_K(t) = {}^t\sigma^{-1}\left(\eta + \sigma^{-1}\lambda\right) = {}^t(0, 1/2),$$

which means that we must not invest in the first stock and must invest half of the wealth in the second one.
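The figures of the two-stock case can be reproduced with a few lines of Python. The sketch below is illustrative only: the volatility matrix `SIGMA` is an assumption, chosen because it is consistent with the quoted values ${}^t(-1/3,-4/3)$, ${}^t(13.5, 0)$ and ${}^t(0, 1/2)$, and all helper names are hypothetical. It also confirms that clipping the unconstrained portfolio to K would give ${}^t(0,0)$ rather than the constrained optimum.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


# Assumption: a volatility matrix consistent with the figures quoted in the text.
SIGMA = [[1.0, -10.0], [-1.0, 1.0]]
ETA = [1.0, 2.0]
ALPHA, BETA = [0.0, 0.0], [1.0, 1.0]
SIGMA_INV = inv2(SIGMA)


def portfolio(lam):
    """w = (sigma')^{-1} (eta + sigma^{-1} lam)."""
    shifted = [e + s for e, s in zip(ETA, matvec(SIGMA_INV, lam))]
    return matvec(transpose(SIGMA_INV), shifted)


def objective(x):
    """Bracket of (8.61): 0.5*||eta + sigma^{-1} x||^2 - alpha.x^+ + beta.x^-."""
    shifted = [e + s for e, s in zip(ETA, matvec(SIGMA_INV, x))]
    delta = sum(b * max(-xi, 0.0) - a * max(xi, 0.0)
                for a, b, xi in zip(ALPHA, BETA, x))
    return 0.5 * sum(s * s for s in shifted) + delta


w_free = portfolio([0.0, 0.0])      # unconstrained portfolio
w_K = portfolio([13.5, 0.0])        # constrained portfolio
w_clipped = [min(max(w, a), b) for w, a, b in zip(w_free, ALPHA, BETA)]
```

Since the objective is convex, comparing `objective([13.5, 0.0])` with its value at nearby points gives a quick sanity check that $\lambda$ is indeed the minimizer.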
8.2.2.3 Deterministic coefficients case 


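• We get feedback formulas: there exists a formal Hamilton-Jacobi-Bellman equation associated with the dual optimization problem (8.11):

$$\frac{\partial Q}{\partial t} + \inf_{x \in \tilde{K}} \left[ \frac{1}{2} v^2 \frac{\partial^2 Q}{\partial v^2} \left\|\eta(t) + \sigma^{-1}(t)x\right\|^2 - v \frac{\partial Q}{\partial v}\,\delta(x) \right] - v \frac{\partial Q}{\partial v}\, r(t) + \tilde{U}(t,v) = 0, \qquad (8.62)$$

and

$$Q(T, v) = \hat{U}(v).$$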


If there exists a solution Q which satisfies mild growth conditions, then
the dual function satisfies (see Fleming and Rishel [231]).


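Suppose that $\delta(\cdot) \equiv 0$ on $\tilde{K}$, such as for all previous constraint sets K. Thus the process

$$\lambda(t) = \arg\min_{x \in \tilde{K}} \left[ 2\delta(x) + \left\|\eta(t) + \sigma^{-1}(t)x\right\|^2 \right] \qquad (8.63)$$

is deterministic and constant w.r.t. v. Then Equation (8.62) becomes:

$$\frac{\partial Q}{\partial t} + \frac{1}{2}\left\|\eta(t) + \sigma^{-1}(t)\lambda(t)\right\|^2 v^2 \frac{\partial^2 Q}{\partial v^2} - r(t)\, v \frac{\partial Q}{\partial v} + \tilde{U}(t,v) = 0. \qquad (8.64)$$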


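Consider for instance the case where the utility of consumption and the utility of terminal wealth are both equal to $x^{\alpha}/\alpha$. Then both convex conjugate functions are equal to $x^{-\rho}/\rho$ with $\rho = \alpha/(1-\alpha)$. Therefore, the Cauchy problem (8.62) has a solution of the form:

$$Q(t,v) = \frac{1}{\rho}\, v^{-\rho}\, u(t),$$

where the function u(.) is the solution of the following ODE:

$$\frac{du}{dt} + h(t)u(t) + 1 = 0, \quad u(T) = 1,$$

with:

$$h(t) = \rho \inf_{x \in \tilde{K}} \left[ \frac{1+\rho}{2}\left\|\eta(t) + \sigma^{-1}(t)x\right\|^2 + \delta(x) \right] + r(t)\rho.$$

The process $\lambda$ is again deterministic and:

$$\lambda(t) = \arg\min_{x \in \tilde{K}} \left[ 2(1-\alpha)\delta(x) + \left\|\eta(t) + \sigma^{-1}(t)x\right\|^2 \right]. \qquad (8.65)$$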
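The linear ODE $u' + h(t)u + 1 = 0$, $u(T) = 1$, is easy to integrate. As an illustration only, the Python sketch below assumes a constant coefficient $h \neq 0$, for which $u(t) = e^{h(T-t)} + (e^{h(T-t)} - 1)/h$, and compares this closed form with a backward Runge-Kutta integration that also accepts a time-dependent $h$. The function names and the numerical values are hypothetical.

```python
import math


def u_closed_form(t, T, h):
    """Solution of u' + h*u + 1 = 0 with u(T) = 1, for a constant h != 0."""
    growth = math.exp(h * (T - t))
    return growth + (growth - 1.0) / h


def u_numeric(t, T, h_func, n_steps=2000):
    """Integrate backward from T with RK4, using v(tau) = u(T - tau)."""
    dt = (T - t) / n_steps
    v, tau = 1.0, 0.0

    def rhs(tau_, v_):
        # dv/dtau = h(T - tau) * v + 1
        return h_func(T - tau_) * v_ + 1.0

    for _ in range(n_steps):
        k1 = rhs(tau, v)
        k2 = rhs(tau + dt / 2, v + dt * k1 / 2)
        k3 = rhs(tau + dt / 2, v + dt * k2 / 2)
        k4 = rhs(tau + dt, v + dt * k3)
        v += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        tau += dt
    return v
```

With deterministic but time-varying coefficients, only `u_numeric` applies; the closed form serves as a check in the constant case.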


Cvitanic and Karatzas [137] provide a general result for the deterministic 
case. 


PROPOSITION 8.4 

Suppose that r(.), b(.), and $\sigma(.)$ are deterministic and that there exists a deterministic process $\lambda(.) \in D$, which achieves the infimum of dual problem (8.11), and that, for some real numbers $\alpha > 0$, $\beta > 0$, and $M > 0$,


T(t,y) + J(y) + |7 (t, y)| + 





Fy) < M(y® +y~*),0<y<o. 


With all assumptions introduced in this section, the optimal consumption/investment process $(c_K, w_K)$ is given in feedback form w.r.t. the wealth V(t) by:


M(t) = It FEV (8), 


—1 (à) 
wg(t) = -o (t) a(t) [AH + w(t) — r(e) TOO 


REMARK 8.3 This result shows that, for the deterministic case, the 
weighting ratios are still constant, despite the additional constraints on the 
portfolio strategies: 

wrilt) _ o(t)'o(t)* AE) + ule) = r(e) 


4 


wrat) oa) AW + ult) — rN; 
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8.3 Optimization with transaction costs 


As shown by Magill and Constantinides [368], Davis and Norman [149], and 
Shreve and Soner [470], the standard approach to utility maximization under 
transaction costs is based on the analytical study of the value function which 
usually leads to an optimal strategy corresponding to: 


e No transaction in a certain region. With a single risky asset, no trade 
is made when the stock weight lies in a given interval. 


e Minimal transactions at the boundary such that the weighting vector 
stays in the region. 


This kind of result holds for infinite horizon: it is always optimal to invest
in the stock (even a small amount) if its rate of return exceeds the riskless rate.

For finite horizon, Cvitanic and Karatzas [139] show that the result may 
be quite different. If the difference between the stock return and the riskless 
return is non-negative but small, and the time horizon is also relatively small, 
then it may be optimal not to trade at all. 
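To make the "no transaction region plus minimal trades at the boundary" rule concrete, here is a minimal Python sketch for a single risky asset. It is not taken from the cited papers: the band bounds `w_lo` and `w_hi` are hypothetical inputs (in the theory they come from solving a free-boundary problem), the stock weight is assumed to be y/(x+y), and `lam` and `mu` denote the proportional costs on purchases and sales.

```python
def rebalance(x, y, w_lo, w_hi, lam, mu):
    """Return holdings (bank, stock) after the minimal trade bringing the
    stock weight y / (x + y) back into [w_lo, w_hi]; assumes x + y > 0."""
    total = x + y
    weight = y / total
    if weight < w_lo:
        # Buy L of stock: the bank account pays (1 + lam) * L.
        L = (w_lo * total - y) / (1.0 + w_lo * lam)
        return x - (1.0 + lam) * L, y + L
    if weight > w_hi:
        # Sell M of stock: the bank account receives (1 - mu) * M.
        M = (y - w_hi * total) / (1.0 - w_hi * mu)
        return x + (1.0 - mu) * M, y - M
    return x, y  # inside the band: no transaction
```

The trade sizes are chosen so that the post-trade weight, computed after costs have been paid, sits exactly on the nearest boundary of the band.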


8.3.1 The infinite-horizon case 


Consider the model introduced by Davis and Norman [149], and further 
studied by Shreve and Soner [470] who use the concept of viscosity solutions 
to Hamilton-Jacobi-Bellman equations. 

- The financial market consists of one riskless asset, the bond B, and one
risky asset, the stock S, given by:

$$dB_t = B_t\, r(t)\,dt, \qquad dS_t = S_t\,[\mu(t)\,dt + \sigma(t)\,dW_t],$$

where W is a standard Brownian motion, for $t \in [0,T]$, on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$.

The processes $r(t)$, $\mu(t)$, and $\sigma(t)$ are assumed to be measurable, $\mathcal{F}_t$-adapted, and uniformly bounded on $[0,T] \times \Omega$. Besides, the process $\sigma(t)$ is assumed to be uniformly bounded away from zero.

- The trading strategy is a pair (L, M) of adapted and left-continuous processes on $[0,T]$ with non-decreasing paths and L(0) = M(0) = 0. The process L (respectively M) denotes the cumulative purchases (resp. sales). It is the total amount transferred from bank account to stock (resp. from stock to bank account).

- The transaction costs are proportional: $0 < \lambda, \mu < 1$.




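- The portfolio holdings $(V_B, V_S)$ corresponding to a given strategy (L, M) evolve according to:

$$V_{B,t} = V_{B,0} - (1+\lambda)L_t + (1-\mu)M_t + \int_0^t (V_{B,u}\, r_u - c_u)\,du, \qquad (8.66)$$

$$V_{S,t} = V_{S,0} + L_t - M_t + \int_0^t V_{S,u}\,[\mu_u\,du + \sigma_u\,dW_u]. \qquad (8.67)$$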
0 


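- The solvency region is defined by:

$$S_{\lambda,\mu} = \left\{ (x,y) \in \mathbb{R}^2 \mid x + (1-\mu)y \geq 0 \text{ and } x + (1+\lambda)y \geq 0 \right\}.$$

Denote respectively by $\partial^{+}$ and $\partial^{-}$ the upper and lower boundaries of the solvency region. The investor’s net wealth is equal to zero on $\partial^{+} \cup \partial^{-}$.

- The consumption rate is an adapted process denoted by c.

- An admissible strategy (c, L, M) is such that the corresponding holdings stay in the solvency region:

$$P\left[(V_{B,t}, V_{S,t}) \in S_{\lambda,\mu}, \text{ for all } t \geq 0\right] = 1. \qquad (8.68)$$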
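The solvency requirement says that the position can be liquidated, after proportional costs, without leaving a debt. A small Python sketch, with hypothetical function names and with the two inequalities taken as weak (an assumption), shows that the two linear conditions are equivalent to a non-negative liquidation value:

```python
def net_wealth(x, y, lam, mu):
    """Liquidation value of (bank x, stock y): a long stock position is sold
    at (1 - mu), a short one is bought back at (1 + lam)."""
    return x + (1.0 - mu) * y if y >= 0 else x + (1.0 + lam) * y


def is_solvent(x, y, lam, mu):
    """Membership in the solvency region, with weak inequalities assumed."""
    return x + (1.0 - mu) * y >= 0 and x + (1.0 + lam) * y >= 0
```

For a long stock position the binding inequality is the one with $(1-\mu)$; for a short position it is the one with $(1+\lambda)$.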


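- The maximization of expected utility from consumption is defined by:

$$J(x,y) = \sup_{(c,L,M)} E\left[ \int_0^{\infty} e^{-\delta t}\, U(c_t)\,dt \right], \qquad (8.69)$$

where the supremum is taken over the admissible strategies (c, L, M) for the initial holdings (x, y).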


8.3.1.1 A special case 


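Assume that the utility function is a power function: $U(x) = x^{\alpha}/\alpha$. Suppose also that both processes L and M are absolutely continuous w.r.t. the current time:

$$L_t = \int_0^t l_s\,ds \quad \text{and} \quad M_t = \int_0^t m_s\,ds.$$

The Bellman equation to be solved for the value function J is

$$\max_{(c,l,m)} \left[ \frac{1}{2}\sigma^2 y^2 \frac{\partial^2 J}{\partial y^2} + r x \frac{\partial J}{\partial x} + \mu y \frac{\partial J}{\partial y} + \frac{1}{\alpha}c^{\alpha} - c\frac{\partial J}{\partial x} + \left( \frac{\partial J}{\partial y} - (1+\lambda)\frac{\partial J}{\partial x} \right) l + \left( (1-\mu)\frac{\partial J}{\partial x} - \frac{\partial J}{\partial y} \right) m - \delta J \right] = 0.$$

The maximum is achieved for:

$$c = \left( \frac{\partial J}{\partial x} \right)^{1/(\alpha - 1)},$$

$$l = \begin{cases} \kappa & \text{if } \frac{\partial J}{\partial y} \geq (1+\lambda)\frac{\partial J}{\partial x}, \\ 0 & \text{if } \frac{\partial J}{\partial y} < (1+\lambda)\frac{\partial J}{\partial x}, \end{cases} \quad \text{and} \quad m = \begin{cases} 0 & \text{if } \frac{\partial J}{\partial y} > (1-\mu)\frac{\partial J}{\partial x}, \\ \kappa & \text{if } \frac{\partial J}{\partial y} \leq (1-\mu)\frac{\partial J}{\partial x}, \end{cases}$$

where $\kappa$ denotes the maximal trading rate.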

Thus, the optimal strategy is bang-bang: buying and selling are made either
at maximum rate or not at all.


The solvency region splits into three parts: “buy” (B) , “sell” (S), and “no 
transaction” (NT). 
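The split into three regions depends only on how the marginal value of stock compares with the marginal value of cash, adjusted for costs. The following Python sketch is purely illustrative: `J_x` and `J_y` stand for the partial derivatives of the value function at the current holdings and are assumed to be supplied by the caller.

```python
def trading_region(J_x, J_y, lam, mu):
    """Classify holdings as 'buy', 'sell' or 'no transaction' from the
    partial derivatives of the value function (J_x > 0 assumed)."""
    if J_y >= (1.0 + lam) * J_x:
        return "buy"             # one more unit of stock is worth its cost
    if J_y <= (1.0 - mu) * J_x:
        return "sell"            # the sale proceeds are worth more in cash
    return "no transaction"
```

Since $(1-\mu) < (1+\lambda)$, the two trading conditions cannot hold at the same time, so the three labels are mutually exclusive.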





FIGURE 8.1: Bond/stock space: directions of finite transactions and
solvency region


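At the boundary between (B) and (NT) subsets, we have:

$$\frac{\partial J(x,y)}{\partial y} = (1+\lambda)\frac{\partial J(x,y)}{\partial x}.$$

At the boundary between (S) and (NT) subsets, we have:

$$\frac{\partial J(x,y)}{\partial y} = (1-\mu)\frac{\partial J(x,y)}{\partial x}.$$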


Then, these boundaries are to be determined precisely. For this purpose,
assume that the value function satisfies a homothetic property:

$$J(x,y) = y^{\alpha}\,\Psi\!\left(\frac{x}{y}\right),$$

where $\Psi(x) = J(x,1)$. Therefore, J is constant along the lines of slope
$-(1+\lambda)^{-1}$ in B, and $-(1-\mu)^{-1}$ in S. Thus:

$$\Psi(x) = \frac{1}{\alpha} A (x + 1 - \mu)^{\alpha}, \quad x \leq x_0,$$

$$\Psi(x) = \frac{1}{\alpha} B (x + 1 + \lambda)^{\alpha}, \quad x \geq x_1, \qquad (8.70)$$

for some constants A and B, and $x_0$ and $x_1$ defined as in the previous figure.


THEOREM 8.6 Davis and Norman [149] 
Assume that: 
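i) Well-posed condition: $\delta > \alpha\left[r + (\mu - r)^2/\sigma^2(1-\alpha)\right]$.
ii) Transaction costs: $\lambda \in [0,\infty[$, $\mu \in [0,1[$ and $\max(\lambda, \mu) > 0$.
Then:
1) Power utility case ($U(x) = x^{\alpha}/\alpha$). Suppose that:

$$r < \mu < r + (1-\alpha)\sigma^2.$$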


Then, the optimal solutions are defined from Equation (8.70): Let NT
denote the closed wedge $\{(x,y) \in \mathbb{R}_+^2 : x_0 \leq x/y \leq x_1\}$. For $(x,y) \in NT - \{(0,0)\}$, define

$$c^*(x,y) = y \left[ \Psi'\!\left(\frac{x}{y}\right) \right]^{-1/(1-\alpha)}.$$

Then, the process $c^*_t = c^*(V^*_{B,t}, V^*_{S,t})$, where the holdings $V^*_B$ and $V^*_S$ satisfy
Equations 8.66 and 8.67 with the strategy $(L^*, M^*)$, is optimal for any initial investment (x,y) in NT. If
$(x,y) \notin NT$, then an immediate transaction leads to the closest point in NT.
The processes $L^*$ and $M^*$ are the local times of the wealth at the boundaries
of the no transaction region NT with regions B and S:

$$L^*_t = \int_0^t \mathbb{1}_{\{(V_{B,s},V_{S,s}) \in \partial B\}}\,dL^*_s \quad \text{and} \quad M^*_t = \int_0^t \mathbb{1}_{\{(V_{B,s},V_{S,s}) \in \partial S\}}\,dM^*_s.$$

Thus, the optimal strategy is to trade minimally in order to keep the stock
weight between $(1+x_1)^{-1}$ and $(1+x_0)^{-1}$.


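2) Logarithmic utility case ($U(x) = \ln x$). Suppose that:

$$r < \mu < r + \sigma^2.$$

There are constants $x_0$, $x_1$, A and B such that:

$$\Psi(x) = \frac{1}{\delta}\ln\left[A(x + 1 - \mu)\right], \quad x \leq x_0,$$

$$\Psi(x) = \frac{1}{\delta}\ln\left[B(x + 1 + \lambda)\right], \quad x \geq x_1. \qquad (8.71)$$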
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Define the function C*(.,.) by: 


CEO 


Note that, if $\mu = r$ and $\delta > \alpha r$, then the optimal strategy is to close out
any position in stock and to consume optimally from the bank account.


8.3.2 The finite-horizon case 


Consider the model introduced by Cvitanic and Karatzas [139]. The finan- 
cial market consists of the same assets as in the previous case, transaction 
costs are still proportional and the trading strategies (L, M) are also defined 
as previously. 

- Introduce the following auxiliary martingales: D denotes the class of pairs
of strictly positive $(\mathcal{F}_t)$-martingales $(Z_0, Z_1)$ with:

$$Z_0(0) = 1, \quad Z_1(0) \in [S_0(1-\mu), S_0(1+\lambda)], \qquad (8.72)$$

and

$$(1-\mu) \leq \frac{Z_1(t)}{Z_0(t)P(t)} \leq (1+\lambda), \qquad (8.73)$$

where P is the discounted price of the stock: $P(t) = S(t)/B(t)$.


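The martingales $Z_0$ and $Z_1$ are the feasible state-price densities for holdings
in bank and stock in the markets with transaction costs. From the martingale
representation theorem, there exist predictable processes $\theta_0$ and $\theta_1$ such that:

$$Z_0(t) = Z_0(0)\exp\left[ \int_0^t \theta_0(u)\,dW_u - \frac{1}{2}\int_0^t \theta_0^2(u)\,du \right], \qquad (8.74)$$

$$Z_1(t) = Z_1(0)\exp\left[ \int_0^t \theta_1(u)\,dW_u - \frac{1}{2}\int_0^t \theta_1^2(u)\,du \right]. \qquad (8.75)$$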


Denote by $Z_0^0$ the density process when there is no transaction cost. We have:

$$Z_0^0(t) = Z_0^0(0)\exp\left[ \int_0^t \theta_0^0(u)\,dW_u - \frac{1}{2}\int_0^t (\theta_0^0)^2(u)\,du \right],$$

with

$$\theta_0^0(t) = \frac{r(t) - \mu(t)}{\sigma(t)}.$$

- The utility function on wealth $U : (0,+\infty) \to \mathbb{R}$ satisfies usual assumptions as in previous sections. The inverse of the marginal utility is again
denoted by J: $J = (U')^{-1}$.
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- The maximization of expected utility from terminal wealth is solved as 
follows. 


The terminal wealth $V_{B,T+}$ is defined by:

$$V_{B,T+} = V_{B,T} + f(V_{S,T}) \quad \text{with} \quad f(u) = \begin{cases} (1+\lambda)u & \text{if } u \leq 0, \\ (1-\mu)u & \text{if } u > 0. \end{cases}$$

This means that, at the end of the management period, the investor liquidates his position in the stock and transfers it to the bank account, paying the transaction
costs.


For given initial holdings $(V_{B,0}, V_{S,0})$, we have to search for an optimal strategy
$(L^*, M^*)$ that maximizes expected utility from terminal wealth:

$$J(V_{B,0}, V_{S,0}) = \sup_{(L,M)} E\left[ U\left(V_{B,T} + f(V_{S,T})\right) \right]. \qquad (8.76)$$
Consider the dual problem 
fa . ~ ( ,Zo(T) Vs,0 
Veo) = int JB O (cH) + Bec], er 


under the assumption that there exists a pair (Zð, ZI) € D that achieves the 
infimum, for all 0 < Ç < co. Additionally, for all 0 < Ç < oo, we have: 


(5) 


where Z§(T) is the density process when there is no transaction cost. 





Zo (T) 
B(T) 


Z§(T) 
B(T) 

















T (C, Vso) < œ and E | 


Note for example that this assumption is satisfied if Vso = 0, and either 
U(x) = In(a) or U(x) = x° /a. 
Consider the function T 


Zo(T) 
B(T) 











>r = 











ZT 
E ST) 7 (¢ 

B(T) 

Since the function T is continuous and strictly decreasing, there exists a 
unique ¢* such that 














: 


We deduce: 


Zo(T) 
B(T) 





sle 


Zy(T) 
B(T) 





) 


= V; pee el 
Bo + So 











E[Zi(T)]. 
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THEOREM 8.7 
(Cvitanic and Karatzas [139]) Under previous assumptions, there exists an
optimal pair $(L^*, M^*)$ such that:

$$V^*_{B,T+} = V^*_{B,T} + f(V^*_{S,T}) = J\left( \zeta^* \frac{Z_0^*(T)}{B(T)} \right), \qquad (8.78)$$

with the following property:

$L^*$ is flat off the set $\{0 \leq t \leq T \mid R^*(t) = 1+\lambda\}$,
$M^*$ is flat off the set $\{0 \leq t \leq T \mid R^*(t) = 1-\mu\}$,


and: 

















Vout ROVE, zl (e20) 1 


BO BO) ) B A | 2 


where R*(t) = Zi (t)/Zö(t) and Q% is defined by Zð = Ii, 


The following examples illustrate the fact that for finite-horizon, it may be 
optimal not to trade at all. 


Example 8.10 
Assume that the rate r is deterministic and the initial amount $V_{S,0}$ invested
in the stock is equal to 0. In that case, we have:


= aiot P(A) 


Thus, the infimum is achieved by any pair $(Z_0^*(.), Z_1^*(.))$ such that $Z_0^*(.) = 1$
(then, with $(1-\mu) \leq Z_1^*(.)/P(.) \leq (1+\lambda)$).

















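Consider for instance $Z_1^*(0) = (1+\lambda)S_0$ and $\theta_1^*(.) = \sigma(.)$. In that case,
$(1, Z_1^*(.)) \in D$ if and only if:

$$0 \leq \int_0^t [\mu(s) - r(s)]\,ds \leq \ln\frac{1+\lambda}{1-\mu}, \quad \text{for all } 0 \leq t \leq T.$$

This condition is satisfied if

$$r(.) \leq \mu(.) \leq r(.) + \rho \quad \text{for some } 0 \leq \rho \leq \frac{1}{T}\ln\frac{1+\lambda}{1-\mu}.$$

Also, the no-trading strategy ($L^* \equiv 0$, $M^* \equiv 0$) is optimal and leads to

$$V^*_{B,T} = V_{B,0}\,B(T) \quad \text{and} \quad V^*_{S,T} = 0. \qquad (8.80)$$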
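The no-trade condition of this example is simple to evaluate for given deterministic coefficients. The Python sketch below is only illustrative: the function names are hypothetical, `excess` is a user-supplied function returning $\mu(s) - r(s)$, and the integral is approximated by a midpoint rule on a uniform grid.

```python
import math


def max_excess_return(T, lam, mu_cost):
    """Largest constant rho such that r <= mu <= r + rho implies no trading."""
    return math.log((1.0 + lam) / (1.0 - mu_cost)) / T


def no_trade_condition(excess, T, lam, mu_cost, n_grid=1000):
    """Check 0 <= int_0^t (mu - r) ds <= ln((1+lam)/(1-mu_cost)) on a time grid."""
    bound = math.log((1.0 + lam) / (1.0 - mu_cost))
    dt = T / n_grid
    integral = 0.0
    for k in range(n_grid):
        integral += excess((k + 0.5) * dt) * dt  # midpoint rule
        if integral < -1e-12 or integral > bound + 1e-12:
            return False
    return True
```

With 1% proportional costs on both sides and a one-year horizon, the threshold is about 2% per year: below this excess return, the cost of entering and leaving the stock outweighs the expected gain.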




Note that if $\mu(.) = r(.)$, even with no transaction cost, it is not optimal to
trade. However, if $\mu(.) > r(.)$, the optimal stock weight is positive if there is no
transaction cost. This is true even with transaction costs for infinite-horizon
with constant coefficients, as seen in the previous section.


Example 8.11 
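Consider the case $\mu(.) = r(.)$ deterministic and a strictly positive amount $V_{S,0}$
initially invested in the stock. In that case, we have:

$$Z_0^*(.) = 1 \quad \text{and} \quad Z_1^*(t) = (1-\mu)S_0 \exp\left[ \int_0^t \sigma(u)\,dW_u - \frac{1}{2}\int_0^t \sigma^2(u)\,du \right].$$

The optimal strategy is given by:

$$L^*(.) = 0, \quad M^*(.) = V_{S,0}\,\mathbb{1}_{(0,T]}(.).$$

This corresponds to an immediate liquidation of the stock position. We
have:

$$V^*_{B,T+} = V^*_{B,T} + f(V^*_{S,T}) = J\left( \frac{\zeta^*}{B(T)} \right) = \left(V_{B,0} + V_{S,0}(1-\mu)\right) B(T), \qquad (8.81)$$

and

$$V^*_B(t) = \left(V_{B,0} + V_{S,0}(1-\mu)\mathbb{1}_{(0,T]}(t)\right) B(t), \quad V^*_S(t) = V_{S,0}\,\mathbb{1}_{\{0\}}(t). \qquad (8.82)$$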


8.4 Other frameworks 
8.4.1 Labor income 


We examine the portfolio optimization problem of an investor who is en- 
dowed with a stochastic insurable stream throughout his lifetime. The in- 
vestor’s wealth is assumed to be non-negative over the lifetime interval, which 
implies that the wage income cannot be sold in the financial market. This liq- 
uidity constraint is examined by Cuoco [134], and by El Karoui and Jeanblanc- 
Picqué [190], who succeed in finding a closed formula. 


- The financial market: 
This consists of one “riskless” asset and d risky securities. 


- The riskless asset B satisfies:

$$dB_t = B_t\, r_t\,dt,$$

where $r_t$ is the short interest rate and $B_0 = 1$.


- Let S be the price vector of d risky securities. Let $(\Omega, P)$ be the probability
space. Assume that S is defined from the following stochastic differential
equation (SDE):

$$dS_t = S_t \cdot \left( \mu(t, S_t)\,dt + \sigma(t, S_t)\,dW_t \right), \qquad (8.83)$$

where $W = (W_1, \ldots, W_d)$ is a standard d-dimensional Brownian motion.


- The information is modelled by the filtration $\mathcal{F}_t$ generated by the Brownian motion (and as usual completed in order to contain all P-null sets).

- The processes $\mu(.,.) = (\mu_1, \ldots, \mu_d)(.,.)$ and $\sigma(.,.) = [\sigma_{i,j}(.,.)]_{i,j}$ satisfy
usual conditions which ensure that the previous SDE has one and only one
solution. These processes are assumed to be $\mathcal{F}_t$-predictable and uniformly bounded. The matrix $\sigma(t,.)$ is invertible with bounded inverse for any
$t \in [0,T]$.

- There exists a predictable and bounded process $\theta$, the risk premium vector,
such that:

$$\sigma_t \theta_t = \mu_t - r_t \mathbb{1}, \quad \text{a.s.}$$

Under the previous hypothesis, the financial market is complete and without
arbitrage opportunity.

- Assumptions on portfolio strategies: At any time t, the investor receives
an income at the rate $e_t$ and chooses the amount $c_t$ per time unit, which
is assigned to his consumption, and also the portfolio weighting $w_t$ which is
supposed to be self-financing. The cumulative consumption $\int_0^t c_s\,ds$ is an $\mathcal{F}_t$-adapted process with $\int_0^T c_s\,ds < \infty$, P-a.s.

Note that if we impose $c_t \geq \max(e_t, 0)$, then the optimization problem
looks like the standard one, since the assumption of non-negative terminal
wealth implies the liquidity constraint. When $e_t \geq 0$, $\forall t \geq 0$, the process e
can be interpreted as a labor wage. When $e_t \leq 0$, $\forall t \geq 0$, the process e can be
viewed as a constraint on the consumption process: the excess consumption satisfies
$\tilde{c}_t = c_t - e_t \geq -e_t$.

The portfolio weighting $w_t$ is predictable and such that $\int_0^T \|w_s\|^2\,ds < \infty$,
P-a.s., where $\|.\|$ denotes the norm.


Thus, the portfolio value $V_t$ is an Itô process defined by:


t d 


t t 
Vi = Vo +f Ts Vads + >| wey — y (Cs — es) ds. (8.84) 
0 sa] Jo Sis 0 
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Assumption on the income process: The income process (e;): is spanned by 
the market assets. There is no “extra noise” on the income dynamics. This
means that $(e_t)_t$ is $\mathcal{F}_t$-adapted.
The process $(e_t)_t$ is also supposed to be square-integrable.
- Assumption on liquidity constraint: The wealth process V satisfies $V_t \geq 0$
at any time during the management period $[0,T]$. This condition implies that
the investor cannot borrow against future labor income.


- Assumptions on utility functions: For any time t, let U(.,t) and U(., t) be 
two utility functions satisfying Vt € [0, T], 


- U(.,t) and U (.,t) are defined on R*, strictly concave, non-decreasing and 
continuously differentiable. 


- limgsoo U (z, t) = 0 and limz—o ZU(.,t) = 0. 


- U(.,.) and U(.,.) are continuous on Rt x [0,7] (for example: U(a,t) = 
e-?tu(x) with p positive scalar). 


Note that, by strict concavity, the marginal utilities $\frac{\partial}{\partial x}U(., t)$ and $\frac{\partial}{\partial x}\hat{U}(., t)$ are strictly decreasing.
Therefore, their inverse functions $J$ and $\hat{J}$ exist.


As in previous sections, U and U denote respectively the convex conjugate 


functions of U and U. 


The maximization of the expected intertemporal utility along the time period $[0,T]$ is the following optimization problem:

$$\max_{c,w} E\left[ \int_0^T U(c_s, s)\,ds + \hat{U}(V_T, T) \right], \qquad (8.85)$$

for a given initial budget (at time t = 0, the portfolio value V is equal to a
given value $V_0$) and for a given income process e.


Since the financial market is complete, all positive and square-integrable
consumption/wealth plans $(c, \xi_T)$ are replicated by square-integrable processes (w, V) such that:

$$dV_t = r_t V_t\,dt + \sum_{i=1}^{d} w_{i,t}\,\sigma_{i,t}\,V_t\,(dW_t + \theta_t\,dt) - (c_t - e_t)\,dt, \qquad (8.86)$$

$$V_T = \xi_T. \qquad (8.87)$$

Assume that the consumption/wealth plan $(c, \xi_T)$ is such that c is a non-negative $\mathcal{F}_t$-progressively measurable process with $\int_0^T |c_s|\,ds < \infty$, and $\xi_T$ is
a non-negative $\mathcal{F}_T$-measurable random variable. Then consider the minimal
solution of 8.86 which is:

$$V_t = E\left[ \int_t^T Z_s^t (c_s - e_s)\,ds + Z_T^t\,\xi_T \,\Big|\, \mathcal{F}_t \right],$$

where $(Z_s^t)_{s \geq t}$ is the shadow state-price process defined by the following forward equation:

$$dZ_s^t = -Z_s^t\,(r_s\,ds + \theta_s\,dW_s), \quad s \geq t, \quad Z_t^t = 1. \qquad (8.88)$$
Thus, the liquidity constraint has the following form: 














where I; = E i Ztesds |F| is the unique solution of 8.86, when c = e = 0. 


The existence of an optimal solution can be proved in a general setting, as 
shown in El Karoui and Jeanblanc-Picqué [190] (see Theorems (3.3), (3.7), and 
(3.8)), by using the Kuhn-Tucker multiplier method to linearize the budget 
constraint, and by solving a stochastic control problem which is the dual of 
the unconstrained problem. 


Consider the standard Markovian framework: the state-price density Z and
the income process e are Markov processes with

$$dZ_t = -Z_t\,[r\,dt + \theta\,dW_t] \quad \text{and} \quad de_t = -e_t\,[\mu_e\,dt + \sigma_e\,dW_t], \qquad (8.90)$$

where $r$, $\theta$, $\mu_e$ and $\sigma_e$ are constant coefficients.


The free dual value function J is defined by: 














T a 
J(t,v,¢) =E |y (© [vZ s)| + vZtes) ds +U (vZ) les=e|, (8.91) 
t 








with A 
U |v,s|+ve+ L(I) =0, (882) 
where 
ƏT ƏT OF NV ggg PT 138I cae 
L(J) = a lp leo Ss? 0 Ape + 5 (on dee Fe 


nN 


and the terminal condition J (T, v, €) = U(v). 
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The dual value function ®(¢, v, e+) is defined by: 


H(t, V, er) = 














T i 
min z / (© [vZ Ds, s)| + vZ{Dses) ds +U (vZpDr_) \er =e], (8.94) 
t 


where D is the class of adapted, right-continuous, non-increasing processes D 
such that Do < 1 and Dr = 0. 


Examine the particular case of absolutely continuous parameters D. Then, 
the problem can be characterized by using variational inequality: 
O® 
— <0 
Ov ~~ 


Ü |v, t] tve +L(J) <0, 


O®\ /~ 
(2) (Ô lv, t] + ve + £(9)) = 0, 
(T, v,£) = 0, 
which can be written as 
ðb a 
max (=. U |v, t)] + ve + ea) =0. (8.95) 


Consider the boundary between the set 4 (t, v, £) |3 (t, v,€) = 0 } and the 
continuation region { (t, v, €) Pt, v,e) <0}. 


The variational equation is associated with an American put: 











W(t,v,e) = sup 
DED 





T 1 al 
y i Zt (-0' [vZi,s)] — es) ds — Ztl-=rU (vZp) ler = | ; 


t 





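Under the risk-neutral probability Q, the process $\tilde{W}_t = W_t + \theta t$ is a Brownian
motion and we have:

$$dZ_t = Z_t\left[(-r + \theta^2)\,dt - \theta\,d\tilde{W}_t\right] \quad \text{and} \quad de_t = -e_t\left[(\mu_e - \sigma_e\theta)\,dt + \sigma_e\,d\tilde{W}_t\right]. \qquad (8.96)$$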
Also, 


1 














T Bs 
W(t,v,c) = sup Eg |y (-0' [vZ s)| — es) ds — l,-rŪ (vZý) ler = | ; 
t 


Tat 


is solution of 
max (AW(t,v) — U(t,v)) =0, (8.97) 
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where 
ow ow ow 1 OA 
fre en ee 
Be ap te He el hoe aie 
1, 300 aw A 
tE degr t eEten a; -rý -U -—e. 


The optimal wealth is V;* = H(t, et, Z), where 


“al 


T 
i Zt (-0' [Zz w), s)|] — es) ds— Z U (Zp_(v)) lee =e 


t 








H(t,e,v) =E 





Using Ito’s formula, we deduce the optimal portfolio w* which is the solution 


of: 


„ôH aH i. 
-2: O - eoe | (t,£, Z¥) = Wt. 


Example 8.12 

A closed formula can be given for the optimal consumption plan in the infinite 

horizon case. Assume that the utility function U(c,t) is HARA: 
l-a 


U(c,t) = eta #1, then — U'(v, t)= (ve) 
a 


Rl- 


Suppose that: 


- A1: The certainty equivalent present value $I_0$ of the lifetime labor income
satisfies:

$$I_0(e) = E\left[ \int_0^{\infty} Z_t\,e_t\,dt \right] < \infty,$$

which means that $r + \mu_e - \sigma_e\theta > 0$. Then, we have: $I_0(e) = Be < \infty$ with
$B = [r + \mu_e - \sigma_e\theta]^{-1}$.
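The constant B follows from the dynamics (8.90): by Itô's formula the product $Z_t e_t$ has drift $-(r + \mu_e - \sigma_e\theta)Z_t e_t$, so that $E[Z_t e_t] = e\,\exp(-(r + \mu_e - \sigma_e\theta)t)$ when $e_0 = e$. The Python sketch below, with hypothetical function names and purely illustrative parameter values, compares $Be$ with a numerical integration of this expectation over a long but finite horizon.

```python
import math


def income_multiplier(r, mu_e, sigma_e, theta):
    """B = 1 / (r + mu_e - sigma_e * theta); requires a positive denominator."""
    k = r + mu_e - sigma_e * theta
    if k <= 0:
        raise ValueError("assumption A1 fails: r + mu_e - sigma_e*theta must be > 0")
    return 1.0 / k


def present_value_numeric(e0, r, mu_e, sigma_e, theta, horizon=200.0, n=200000):
    """Midpoint-rule approximation of int_0^horizon E[Z_t e_t] dt."""
    k = r + mu_e - sigma_e * theta
    dt = horizon / n
    return sum(e0 * math.exp(-k * (i + 0.5) * dt) * dt for i in range(n))
```

The truncation horizon and the grid size are arbitrary choices; they only need to be large enough for the exponential tail to be negligible.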


-A2: The free dual problem is well-posed: 














0< A=E / Zt (vez)? a| < OO. 
0 


ESS)" 


For the free case, from Markovian properties, we deduce that the optimal 
free wealth process VF is given by: 


Note that 





sL 
a 


VJ (v) =A [ve Z] — Be, 
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with the multiplier v such that the initial wealth is equal to Vo: 
AIT? = Vo + Be. 
We deduce that the optimal consumption c* is given by: 


7 VJ w) + Be 
cj(v) = HW + Ber 


The optimal wealth process V/ is the solution of: 


DL 
a 


dVv,f = —Bde, + Ad (vZ,e*) 
= Be, (pedt + a-dW;) 


j (vi F Ber) (<0 -a)+ + (=e)) dt + Paw; l 


The optimal portfolio is given by: 











0 0 
wr — ZV o) + (o. + 2) Ber. 


For the constrained case, the wealth value V;(v) is deduced from the price 
of an American option written on the negative part of the free value: 


Vo(v) = Vol (v) + Jo(v), 


where 














hlv) = snp |z, (4 (ve Z.) F — Be, )| ; 


The associated one-dimensional stopping time problem is as follows. The 
value function Jo corresponds to an optimal stopping problem with a two 
dimensional Markov diffusion processes (ve Z;,e;,t > 0). Since the payof- 
f is homogeneous, this problem can be transformed into a one-dimensional 
stopping time problem: 


ak 
a 




















Zrer -i veř Z 
JOY) —_ supe | T (AY, — B)~| with Y, = ie ig 
g E et 
The function 2™ is the value function of an optimal stopping time problem 


E 
w.r.t. (AY — B), where the Markov process (Y;)¢ is a geometrical Brownian 


motion under the risk-neutral probability Q° with characteristics given by: 


dY; = Yı (uy + oy dW’) , Yo = v aet, 
2 
py = B7! — A71, o2 = Sto? + ee. 
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The optimal stopping problem is solved as follows. Since the value function 


eo) is convex and monotonous, the optimal stopping problem is equivalent 


to the search of an entrance time T(a) such that 
T(a) =inf {t: Y; < a}. 


To this stopping time is associated a “reward” given by: 


























T (y, a) = Ege le AE AY Ga - B) | = (Amin(a, y) — B) Ege [enei] l 
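We can use the Laplace transform of the law of a Brownian entrance time
to compute $\Psi(y,a)$. Let $T(b,\nu) = \inf\{t : \nu t + W_t = b\}$ be the entrance time
for the Brownian motion with drift $(\nu t + W_t)_t$:

$$E\left[ e^{-\lambda T(b,\nu)}\,\mathbb{1}_{\{T(b,\nu) < \infty\}} \right] = \exp\left[ b\nu - |b|\sqrt{\nu^2 + 2\lambda} \right], \quad \lambda > 0.$$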
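As a quick illustration of this Laplace transform, the Python sketch below (the function name is hypothetical) evaluates the right-hand side. Letting $\lambda$ tend to 0 gives the probability that the level b is ever reached: it equals 1 when the drift points toward b and $e^{2\nu b}$ when it points away from it.

```python
import math


def hitting_time_laplace(b, nu, lam):
    """E[exp(-lam * T) 1{T < inf}] for T = inf{t : nu*t + W_t = b}, lam >= 0."""
    return math.exp(b * nu - abs(b) * math.sqrt(nu * nu + 2.0 * lam))
```

Evaluating the function at `lam = 0` is understood here as the limiting case of the formula.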














The process In(Y) is a Brownian motion with drift V given by: 





1 r— ô -— 6?/2 o2 bo 
V = uy — -02 = ———— | + ume — 2% — boe — —. 
HY — a a E LA 


Thus, the stopping time 7(a) corresponds to an entrance time for a Brow- 


nian motion: 
1 
T(a)=T (<n (£) =) , fora < y. 
Oy y Oy 














We have 
7 min(a, y) à 
Bo |e T(a)/ s ee |, 
y 
with 
1 
A=- Iv + 4/ V2 +207 B= 
Oy 


The optimal stopping time corresponds to the value of a which maximizes 
V(y,a) : 
_ AB ( 
~ A(1+A) 


* 


a note that: a* < 1). 


We have also: 


TJo(v) 


E 





en | 


= W(y, min(a*, y)) = (B — Amin(a*,y))* ( 7 


El Karoui and Jeanblanc [190] prove that there is a closed form solution for 
the optimal wealth associated with the constrained dual problem. It is based 
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on the free boundary of this problem: 


be) = (A). 


ABe 
Zo(v) = 0 if v > b(e), 
Iy 
Be ABeva 1 
= —_ ————_—_ T O B h i A 
Zo(v) LLA pe + Av £, otherwise 


Finally, we deduce that the optimal consumption function is given by: (x = 
Vo) 


t 
C* (Va) = min (vz, b(e)) 7? = e(max (y, a*)) with y = € tvz ®. 


The optimal consumption and optimal wealth are linked through a feedback 


formula: 
: B ABO. eNO č 


When the initial wealth value is equal to 0, the maximal value for the con- 
sumption function is a fraction equal to a* of the income stream. 


The fraction a* is equal to AGREE) and is strictly smaller than the fraction 


2 corresponding to the free problem. 


The following figure illustrates how the optimal consumption depends on 
the optimal wealth, according to relation (8.98). It is drawn in the plane 


(=, ©). The upper curve corresponds to the solution of the free problem. 


ETS 


The parameter values are: 





r = 0.1, b = 0.2, o = 0.1, He = 1, Ce = 0.1, y = 0.7 and 8 = 1. 
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Consumption 


FIGURE 8.2: Optimal consumption as function of optimal wealth (horizontal axis: wealth; vertical axis: consumption)


8.4.2 Stochastic horizon 


If the investor does not know with certainty the time of exiting the market
(e.g., retirement, death, etc.), the time horizon becomes uncertain. Therefore,
we have to examine optimization problems such as:

$$\max E\left[ U(V_{\tau \wedge T}) \right], \qquad (8.99)$$

where $\tau \wedge T$ is the investment horizon.

When $\tau$ is a stopping time and the financial market is complete, the solution is provided, for example, by Karatzas and Wang [322] and Richard [424].
Blanchet-Scalliet et al. [79] consider the case where the time $\tau$ is a random
time horizon and the conditional distribution of $\tau$ given the available information at time t is known. The process $F_t = P[\tau \leq t \mid \mathcal{F}_t]$ satisfies the so-called
(G)-assumption. The process $(F_t)_t$ is non-decreasing and right-continuous. In
the complete market framework, they examine in particular the case where
$F_t$ admits a density function f w.r.t. the Lebesgue measure. Then, problem
(8.99) is equivalent to:

$$\max E\left[ \int_0^T U(V_t)\,f_t\,dt + U(V_T)\left(1 - \int_0^T f_s\,ds\right) \right]. \qquad (8.100)$$

This latter problem is reduced to a standard stochastic control problem for
which we can use dynamic programming. Using a mild assumption on the
density f, this problem is solved through PDE (see Blanchet-Scalliet et al.
[79]) or BSDE (see El Karoui et al. [191]). However, they cover neither
the case where $F_t = \mathbb{1}_{\{\tau \leq t\}}$, nor the case where F has no density w.r.t. the
Lebesgue measure.
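The passage from (8.99) to (8.100) can be illustrated in the simplest setting. The Python sketch below makes strong simplifying assumptions that are not in the text: the wealth path is deterministic and the exit time $\tau$ is independent of the market with a known density. Under these assumptions the objective in (8.100) reduces to an ordinary integral, which the hypothetical function `random_horizon_objective` evaluates with a midpoint rule.

```python
import math


def random_horizon_objective(utility, wealth, density, T, n=20000):
    """int_0^T U(V_t) f(t) dt + U(V_T) * (1 - int_0^T f(s) ds), midpoint rule."""
    dt = T / n
    integral_u, mass = 0.0, 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        integral_u += utility(wealth(t)) * density(t) * dt
        mass += density(t) * dt
    return integral_u + utility(wealth(T)) * (1.0 - mass)


# Illustration: V_t = exp(g t), logarithmic utility, exponential exit time.
g, kappa, T = 0.05, 0.3, 10.0
value = random_horizon_objective(
    math.log,
    lambda t: math.exp(g * t),
    lambda t: kappa * math.exp(-kappa * t),
    T,
)
# Here E[ln V at time min(tau, T)] = g * E[min(tau, T)] = g * (1 - exp(-kappa*T)) / kappa.
expected = g * (1.0 - math.exp(-kappa * T)) / kappa
```

In this toy case the two numbers `value` and `expected` agree up to the discretization error, which is the content of the equivalence for an independent exit time.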


This is the reason why Bouchard and Pham [84] introduced a more general
wealth-path-dependent utility maximization problem:

$$\max E\left[ \int_0^T U(V_t)\,dF_t \right], \qquad (8.101)$$








where the process (F;); is non-decreasing and right-continuous. Note that 


Note also that, contrary to fixed time horizon, the whole path of the port- 
folio process must be taken into account. Therefore, the dual variables are 
stochastic processes and not simple Fr—measurable random variables. Thus, 
the optimization problem is not reduced to a static one. Bouchard and Pham 
[84] derive a dual formulation in a general incomplete semimartingale frame- 
work. Then, they provide a solution to the primal problem, using a calculus 
of variation on the primal problem. The model is defined as follows: 


- The financial market consists of one bond B chosen as numéraire and
d securities $(S_{i,t})_{i,t}$. The process S is assumed to be a semimartingale on a
filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$.

- Portfolio strategy and value: The number of shares $(\theta_{i,t})_{i,t}$ invested
in each asset i at time t is a predictable process, which is integrable w.r.t.
the price process S. The portfolio value process $(V_t)_t$ associated with the strategy
$(\theta_{i,t})_{i,t}$ is given, for $t \in [0,T]$, by

$$V_t = V_0 + \sum_{i=1}^{d} \int_0^t \theta_{i,s}\,dS_{i,s}. \qquad (8.102)$$
i=1 79 


The utility function U satisfies the same assumptions as in previous sections. 
The optimization is well-defined by assuming that the wealth process V is 


in the set 
< <] : 


To avoid degeneracy, it is assumed that P [Fr > 0] > 0. The random variable 
Fr is also supposed to satisfy E[Fr] < œo and, without loss of generality, 


|j aFi] =1. 














T 
A= ÊV ivt eV = Oas, an j f U- (V:)dF; 
0 
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PROPOSITION 8.5 Bouchard and Pham [84] 
Under the previous hypothesis and assuming also that: 

i) The financial market has no arbitrage opportunity (there exists at least 
one equivalent local martingale) 

it) The value function F of the problem 


nan 


Te 
Ty) =_ inf | O (Y;)dF; < 0, 
(y) YED) Jo (Yı )dF; 


where D(y) is the set of all processes Y such that Y > 0 w.r.t. the measure 
dm, which is defined by: 














T 
m(E x F)=E | Ipx rdky 
0 





Then, we have: 

i) Existence of a solution to the primal problem: for all Vo > 0, there exists 
a unique optimal solution V* in the set A. 

ii) Existence of a solution to the dual problem: for all Yo > 0, there exists 
a unique optimal solution Y* in the set D(Yo). 

iii) Duality relations: (notation: J = (U’)~+) for all Vo > 0, 














T 
Vi = J(YŽ) with VY = E Į J(Y,)Y i AF, 
0 





Example 8.13 Blanchet-Scalliet et al. [79] 

The financial market contains a riskless asset with rate r and a risky asset
S which is assumed to be a semimartingale on a filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$, solution of the SDE:

$$dS_t = S_t\,[\mu\,dt + \sigma\,dW_t], \qquad (8.103)$$

where W is a standard Brownian motion, $\mu$ is constant and $\sigma$ is a strictly positive constant. This financial market is complete. Thus, there exists one
and only one risk-neutral probability Q defined with the relative risk process
$\eta$ by $\eta = \sigma^{-1}[\mu - r]$. Assume that the deterministic time horizon is finite
($T < +\infty$). Introduce:











T(t, Vo) = sup 7 ti FDU (Vajdu l 





Case 1: For some deterministic function f, F(t) = IM f(u)du on [0,T] and 
F(T) = 1. The random time 7 is independent from the filtration and has the 
density f. 
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e If U(x) = Ina, we have: 
T(t, Vo) = p(t) + a(t) In Vo, 


Pose fi 
x» _ H $ 
Wy = z Vio 


=r- f sodu pt acte f f rape 





with 




















1 (p-r)? À 1 (p-r)? 
Ces = ,A= z [r] rt53 ee ¿ 
e If U(x) = = 0 <a <1, we have: 
T(t, Vo) = PEV, 
* HT * 
A (a—1)o2 *’ 
with 
ET Y i z Ta (per)? 
E at au Ls 
p(t) =e I E f(u)du, c=r— Qa-h ə ` 














w- 2) 
e If U(x) =e"*,r=OandE “s ie < oo, then we have: 





T(t, Vo) = pltje™ 
_ H 
Wy = ee 
with y 
D(t) = ey e™ f(u)du, C= 1(u at) 
t 2 0 


Case 2: F(t) = P[r <t|Ft] = KE u)du on [0,T] where f is solution of the 
following SDE: 
df. = fı (adt + bdW;) . 


The random time 7 is a stopping time w.r.t. the filtration. 
e If U(x) = Inz and a < 0, we have: 


T 
T(t, Vo, fo) = fo Cae oe f q(s)eC8ds + q(t) In Vol , 
t 


5 u—r+ob 2 
oi = (EE w, 


oO 
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with 
1 1 C1(T—t) 1 HoT 
q(t) G) oO o T ? 
u—rt+ob 1 f/u-rt+ob ? 
Cart u- (EER oe) 


e If U(x) = = and a < 0, we have: 


ve /i1 1 1 
t. V: = hx ee aay CT =Gf a 
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8.5 Further reading 


Standard results about dynamic portfolio optimization can be found in Korn 
[333]. As shown in Schachermayer [452], the asymptotic elasticity condition 
is related to the asymptotic relative risk aversion under mild assumptions. 

Investment-consumption models with transaction costs are examined in
Akian et al. ([12],[13]). Quadratic optimization with transaction costs is
analyzed by Adcock and Meade [6]. Bielecki and Pliska [70] study risk-sensitive
dynamic asset management with transaction costs. Optimal portfolio
management with transaction costs equal to a fixed fraction of the portfolio
value, corresponding to a portfolio management fee, is examined in Morton and
Pliska [394]; the asymptotic growth rate (the “Kelly criterion”) is maximized
over an infinite horizon. It is shown that, even with very low transaction
costs, the optimal solution leads to very infrequent trading. Duffie and Sun
[177] examine a model where the transaction cost is also a fixed fraction of
the portfolio value, together with a proportional cost for withdrawal of funds
for consumption. Assuming that wealth is only observed at transaction times,
it is optimal to trade at fixed deterministic intervals. Fleming and Soner
[232] provide details about viscosity solutions to portfolio optimization with
transaction costs (see also Lions [358] and Zariphopoulou [510]). Cadenillas
et al. ([97],[98]) study portfolio optimization with transaction costs and
possible taxes. Portfolio optimization with transaction costs based on impulse
control is introduced in Korn [334]. Deelstra et al. [153] provide the dual
formulation of utility maximization under transaction costs.
Portfolio optimization with labor income has also been studied, without
the assumption that wealth is non-negative over the trading period, by
Karatzas et al. [317]. The investor is allowed to capitalize his wage income
at some interest rate, and this amount is simply added to the current wealth.
He and Pages [287] consider that the investor’s wealth must be non-negative.
They prove that there exists an optimal solution by application of duality
theory. The optimization problem is transformed into an unconstrained dual
shadow price problem. Assuming that the asset price is a Markov diffusion
process, they solve the problem by the dynamic programming method when the
labor income is a function of the asset price. Duffie et al. [174] examine the
same kind of optimization problem by dynamic programming when the income
process is uninsurable. Thus, the financial market is incomplete since the
labor income cannot be duplicated by a portfolio. They provide a quasi-explicit
solution of the H.J.B. equation when the utility function is HARA and when
coefficients are deterministic. They show that, at zero wealth, the investor
consumes a fixed fraction of income while saving the remaining amount in the
riskless asset.

Portfolio optimization under partial information has been studied by Lakner 
[343]. Nagai and Peng [397] consider an optimal investment problem with 
partial information for a factor model. Jeanblanc et al. [296] examine utility 
maximization under partial information when asset dynamics are modelled 
by a jump-diffusion process. They assume that only the vector of stock price 
is observable. However, they prove that the optimization problem can be 
rewritten as a problem with coefficients depending on past history of observed 
prices. 

Portfolio optimization with possible bankruptcy can also be analyzed. For 
example, the investor is under the obligation to pay a debt until he declares 
bankruptcy. This type of problem has been examined by Cadenillas and Sethi 
[99], Lehoczky et al. [350], Sethi [460] and Jeanblanc and Lakner [297]. 

Dynamic benchmark optimization is analyzed by Browne [94], who seeks
to outperform a stochastic benchmark. Buckley and Korn [96] use impulse
control to determine optimal index tracking under transaction costs.


Part IV 


Structured portfolio 
management 


“The sophistication of portfolio insurance users has grown as rapidly as the 
product itself... Sponsors are realizing the advantage of programs that pro- 
tect a fund’s surplus. Portfolio insurance is being applied to many different 
classes of assets besides equities; fixed-income and international investments 
are growing areas of application. An early criticism of portfolio insurance 
was that it reduced return as well as reducing risk. But users are discover- 
ing that portfolio insurance can be used aggressively rather than simply to 
reduce risks. Long-run returns can actually be raised, with downside risks 
controlled, when insurance programs are applied to more aggressive active 
assets. Pension, endowment, and educational funds can actually enhance 
their expected returns by increasing their commitment to equities and other 
high-return sectors, while fulfilling their fiduciary responsibilities by insuring 
this more aggressive portfolio. Compared with current static allocation tech- 
niques, annual expected returns can be raised by as much as 200 basis points 
per year... Dynamics can be used to mold a set of returns to virtually any 
feasible investor objective. It can be used to manage the risks of corporate 
balance sheets as well as investment funds. As such, it may represent the 
most significant advance to date in the science of financial engineering.” 


Hayne E. Leland and Mark Rubinstein, “The Evolution of Portfolio Insur- 
ance,” published in Dynamic Hedging: A Guide to Portfolio Insurance, (1988). 
Chapter 9


Portfolio insurance 


The purpose of portfolio insurance is to limit portfolio losses when sudden
drops occur in the financial market, while allowing investors to benefit from
potential market rises. Thus, very often, insured portfolio values at
maturity satisfy two conditions:


• There exists a guaranteed amount (that is, the probability that
the guarantee is violated must be equal to 0).


• When the market rises, the portfolio return must also rise by at least
a predetermined percentage of a given index return.


Therefore, portfolio insurance generally requires specifying the guarantee
constraint and the portfolio maturity.

Two standard portfolio insurance methods are the Option Based Portfolio
Insurance (OBPI) and the Constant Proportion Portfolio Insurance (CPPI):


• The OBPI, introduced by Leland and Rubinstein [349], consists of a
portfolio invested in a risky asset S (usually a financial index such as
the S&P) covered by a listed put written on it.


• The CPPI was introduced by Perold [406] (see also Perold and Sharpe
[407]) for fixed-income instruments and by Black and Jones [76] for equity
instruments. This method uses a simplified strategy to allocate assets
dynamically over time.


This chapter provides:


• First, some basic properties of the OBPI and CPPI methods, such as their
payoff functions, in particular in the geometric Brownian motion framework.


• Second, a comparison of their payoffs at maturity by means of stochastic
dominance, of the first four moments of their returns, and of the
cumulative distribution of their ratio.


• Finally, an examination and comparison of their dynamic properties. It is
proved that the OBPI method is a generalized CPPI where the multiple
is allowed to vary. We also focus on the dynamics of both methods,
in particular their “Greeks.” In what follows, we use in particular results
in Bertrand and Prigent ([60], [58], [59]).
9.1 The Option Based Portfolio Insurance


Leland and Rubinstein [349] introduce the OBPI method which uses com- 
binations of standard securities (such as bonds and stocks) and options. 


To illustrate this method, let us examine the following example. 


Example 9.1 Why do we need options? 
Consider a given time horizon T (for example one year). 

- A first investor, denoted by $I_1$, chooses to invest in a riskless asset B and
in a risky security S (typically a financial index).

- A second investor, denoted by $I_2$, chooses to invest in B and to buy a Call
option C, written on the asset S.

The amount invested in B by investor $I_2$ corresponds to the discounted exercise
price of the option.

Assume that: 

- Both investors have the same initial total wealth Vo invested in the finan- 
cial market. 

- Each of them wants to recover at maturity a same given percentage p of 
his initial investment. 

Suppose that the riskless rate $r$ is equal to 3%, and that the volatility of S is
equal to 20%. Assume also that $B_0 = 1$ and $S_0 = 100$. Within this framework,
the value $C_0$ of the at-the-money Call option is equal to $C_0 = 9.41$.

- The initial value $V_0$ of investor $I_2$, who buys the option, is given by:

$$V_0 = S_0 e^{-rT} + C_0 \approx 106.41.$$

Then, the portfolio value $V_T^{(2)}$ of investor $I_2$ at maturity T is equal to:

$$V_T^{(2)} = S_0 + (S_T - S_0)^+ = 100 + (S_T - 100)^+.$$

This value is always at least equal to the amount $S_0 = 100$, which is the
guaranteed amount for investor $I_2$. It is the minimal amount that he recovers
at maturity if he cannot exercise the option (if the price of S decreases). Note
that the guaranteed percentage p is given by $p = S_0/V_0 \approx 94\%$.

- The initial value $V_0$ of investor $I_1$, who chooses a simpler strategy (buy
and hold quantities a and b respectively in B and in S, without using an
option), is given by

$$V_0 = aB_0 + bS_0 \approx 106.41.$$

Since investor $I_1$ wants the same guaranteed percentage as $I_2$, he must invest
the same amount in the riskless asset B. This implies that $aB_0 = S_0 e^{-rT} \approx 97$.
Therefore, he chooses the same percentage of investment in B (here, about
91%).

Thus, the portfolio value $V_T^{(1)}$ of investor $I_1$ at maturity T is equal to

$$V_T^{(1)} = aB_0 e^{rT} + bS_T \approx 100 + 0.094 \times S_T.$$
- Let us compare the two portfolio returns for the two cases: 


* First case: the asset price S grows by 30%.
* Second case: the asset price S decreases by 30%.


For a rise of 30%, the portfolio return of investor $I_1$ is approximately equal
to $p + (0.09) \times 1.3$, which gives a rate of about 5.7%. For a decrease of 30%,
the portfolio return of investor $I_1$ is approximately equal to $p + (0.09) \times 0.7$,
which gives a rate of about 0.3%.

For a rise of 30%, the portfolio return of investor $I_2$ is approximately equal
to $p + (0.09) \times$ (option return), which gives a rate of about 22.7%. For a
decrease of 30%, the portfolio return of investor $I_2$ is approximately equal
to p, which gives a rate of about -6%.

Therefore, when the risky asset price S increases significantly, the portfolio 
which contains the option has a much better return than the simple combi- 
nation of the two basic assets B and S. 

However, the converse holds when the risky asset drops. Nevertheless, in
that case, the guarantee is always satisfied. Note also that, compared with
a riskless investment with return $e^{rT} - 1 \approx 3\%$, both portfolios clearly provide
smaller returns when the financial market is bearish, but a riskless investment
cannot benefit from a bullish market.

Note that the percentage that the portfolio of the first investor can acquire
from a potential increase of the risky asset S is rather low. Indeed, we have:

$$\frac{V_T}{V_0} = p + (1 - p e^{-rT}) \frac{S_T}{S_0} \approx 0.94 + (0.09) \frac{S_T}{S_0}. \qquad (9.1)$$

Thus, only about 9% of the risky return $S_T/S_0$ can be obtained.
Concerning the portfolio including the option, note that if the option can
be exercised, its return is equal to:

$$\frac{V_T}{V_0} = p + (1 - p e^{-rT}) \frac{S_0}{C_0} \left(\frac{S_T}{S_0} - 1\right) \approx 0.94 + (0.95) \times \left(\frac{S_T}{S_0} - 1\right). \qquad (9.2)$$


Therefore, this portfolio can benefit from a high leverage when the financial 
market rises. 
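The figures of Example 9.1 can be re-derived with a few lines of code. The sketch below is only illustrative: the function names are not from the text, the call is valued with the standard Black-Scholes formula, and no intermediate rounding is applied, so the returns it prints agree with the rounded percentages quoted in the example only approximately.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(s0, k, r, sigma, t):
    """Black-Scholes value of a European call (no dividends)."""
    d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s0 * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)


# Data of Example 9.1
S0, r, sigma, T = 100.0, 0.03, 0.20, 1.0

C0 = bs_call(S0, S0, r, sigma, T)   # at-the-money call
V0 = S0 * exp(-r * T) + C0          # common initial wealth
p = S0 / V0                         # guaranteed percentage


def gross_return_bond_stock(growth):
    """V_T / V_0 for the buy-and-hold investor, relation (9.1)."""
    return p + (1.0 - p * exp(-r * T)) * growth


def gross_return_bond_call(growth):
    """V_T / V_0 for the investor holding the bond plus the call."""
    s_t = S0 * growth
    return (S0 + max(s_t - S0, 0.0)) / V0


if __name__ == "__main__":
    print(f"C0 = {C0:.2f}, V0 = {V0:.2f}, p = {p:.2%}")
    for growth in (1.3, 0.7):
        print(growth,
              round(gross_return_bond_stock(growth) - 1.0, 4),
              round(gross_return_bond_call(growth) - 1.0, 4))
```

The comparison of the two functions for a growth factor of 1.3 and of 0.7 reproduces the qualitative conclusion of the example: the portfolio with the call dominates in a rising market and is dominated, while still guaranteed, in a falling one.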
9.1.1 The standard OBPI method


The OBPI, introduced by Leland and Rubinstein [349], consists of a
portfolio invested in a risky asset S (usually a financial index such as the S&P)
covered by a listed put written on it. Whatever the value of S at the terminal
date T, the portfolio value will always be at least equal to the strike K of the
put. At first glance, the goal of the OBPI method is to guarantee a fixed amount
only at the terminal date. In fact, as shown in what follows, the OBPI method
provides portfolio insurance at any time. Nevertheless, a European
put with suitable strike and maturity may not be available on the market.
In that case, it must be synthesized by a dynamic replicating portfolio invested
in a risk-free asset (for instance, T-bills) and in the risky asset.


The portfolio manager is assumed to invest in two basic assets: a money
market account, denoted by B, and a portfolio of traded assets such as a
composite index, denoted by S. The period of time considered is [0, T]. The
strategies are self-financing.


The value of the riskless asset B evolves according to:

$$dB_t = B_t\, r\, dt,$$

where r is the deterministic interest rate.


The dynamics of the market value of the risky asset S are given by the
standard diffusion process:

$$dS_t = S_t\left[\mu\, dt + \sigma\, dW_t\right],$$

where $W_t$ is a standard Brownian motion.


The OBPI method consists basically of purchasing q shares of the asset S
and q shares of European put options on S with maturity T and exercise price
K.


Thus, the portfolio value $V^{OBPI}$ is given at the terminal date by:

$$V_T^{OBPI} = qS_T + q(K - S_T)^+, \qquad (9.3)$$

which is also $V_T^{OBPI} = qK + q(S_T - K)^+$, due to the Put/Call parity. This
relation shows that the insured amount at maturity is the exercise price times
the quantity q: qK.

The value $V_t^{OBPI}$ of this portfolio at any time t in the period [0, T] is:

$$V_t^{OBPI} = qS_t + qP(t, S_t, K) = qKe^{-r(T-t)} + qC(t, S_t, K), \qquad (9.4)$$

where $P(t, S_t, K)$ and $C(t, S_t, K)$ are the Black-Scholes values of the European
put and call.
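Relation (9.4) can be checked numerically. The following sketch assumes the standard Black-Scholes formulas for the put and the call; the function names and the parameter values in the usage lines are illustrative and do not come from the text.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def _d1_d2(s, k, r, sigma, tau):
    d1 = (log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return d1, d1 - sigma * sqrt(tau)


def bs_call(s, k, r, sigma, tau):
    """Black-Scholes call value, tau = T - t > 0 being the time to maturity."""
    d1, d2 = _d1_d2(s, k, r, sigma, tau)
    return s * norm_cdf(d1) - k * exp(-r * tau) * norm_cdf(d2)


def bs_put(s, k, r, sigma, tau):
    """Black-Scholes put value, computed directly (not through parity)."""
    d1, d2 = _d1_d2(s, k, r, sigma, tau)
    return k * exp(-r * tau) * norm_cdf(-d2) - s * norm_cdf(-d1)


def obpi_value(q, s, k, r, sigma, tau):
    """Left-hand form of (9.4): q shares plus q puts."""
    return q * s + q * bs_put(s, k, r, sigma, tau)


def obpi_value_call_form(q, s, k, r, sigma, tau):
    """Right-hand form of (9.4): discounted insured amount plus q calls."""
    return q * k * exp(-r * tau) + q * bs_call(s, k, r, sigma, tau)


if __name__ == "__main__":
    for s in (50.0, 100.0, 150.0):
        v = obpi_value(1.0, s, 100.0, 0.03, 0.2, 0.5)
        floor = 100.0 * exp(-0.03 * 0.5)
        print(s, round(v, 4), round(floor, 4))
```

Since the call value is positive, the second form makes the dynamic floor $qKe^{-r(T-t)}$ visible at every date before maturity.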
REMARK 9.1 Assume that standard no-arbitrage conditions hold
together with no market friction. Then, for all dates t before T, the portfolio
value is always above the deterministic level $qKe^{-r(T-t)}$, which shows that
the OBPI strategy also provides a dynamic guarantee.


REMARK 9.2 1) The amount insured at the final date is often expressed
as a percentage p of the initial investment $V_0$ (with $p < e^{rT}$). Since, here, this
amount is equal to the strike K itself, K must be an increasing
function of the percentage p, determined from the relation:

$$pV_0(K) = p\left(qKe^{-rT} + qC(0, S_0, K)\right) = qK. \qquad (9.5)$$

Indeed, we have

$$\frac{C_0(K)}{K} = \frac{1 - pe^{-rT}}{p}, \qquad (9.6)$$

where $C_0(K)$ denotes the initial Call option value (for example determined
from the Black and Scholes formula). Since the ratio $C_0(K)/K$ does not depend on
$\mu$ and is decreasing in K, we deduce that the higher the guaranteed percentage
p, the higher must be the strike K (for a given value $S_0$).

2) For a given initial portfolio value $V_0$ and for given values $S_0$ and $P_0(K)$,
the number q of shares is a decreasing function of the exercise price K, since
q is determined from

$$q = \frac{V_0}{S_0 + P_0(K)}, \qquad (9.7)$$

where $P_0(K)$ denotes the initial Put value.

Finally, note that for a given investment $V_0$ and a fixed guaranteed percentage
p, the option strike K and the number of shares q are entirely determined.
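As a hedged illustration of Remark 9.2, the strike implied by a guaranteed percentage p can be obtained by solving relation (9.5) numerically. The sketch below uses a plain bisection on the gap $p\,(Ke^{-rT} + C_0(K)) - K$, which is decreasing in K when $p < e^{rT}$; the function names, the tolerance and the bracketing rule are assumptions made for the example.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(s0, k, r, sigma, t):
    """Black-Scholes value of a European call (no dividends)."""
    d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s0 * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)


def strike_for_guarantee(p, s0, r, sigma, t, tol=1e-10):
    """Solve p * (K e^{-rT} + C_0(K)) = K for K by bisection (q cancels out)."""
    if not 0.0 < p < exp(r * t):
        raise ValueError("p must lie strictly between 0 and exp(rT)")

    def gap(k):
        return p * (k * exp(-r * t) + bs_call(s0, k, r, sigma, t)) - k

    lo, hi = 1e-8 * s0, s0
    while gap(hi) > 0.0:          # enlarge the bracket until the sign changes
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def shares_for_wealth(v0, k, s0, r, sigma, t):
    """Number of shares q from relation (9.7), the put being valued by parity."""
    put = bs_call(s0, k, r, sigma, t) - s0 + k * exp(-r * t)
    return v0 / (s0 + put)


if __name__ == "__main__":
    for p in (0.90, 0.95, 1.00):
        k = strike_for_guarantee(p, 100.0, 0.03, 0.2, 1.0)
        q = shares_for_wealth(1000.0, k, 100.0, 0.03, 0.2, 1.0)
        print(p, round(k, 4), round(q, 4))
```

With the data of Example 9.1 and p set to $S_0/V_0$, the routine returns the at-the-money strike, and the insured amount qK equals $pV_0$ as required.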


To simplify the presentation and without loss of generality, we shall assume
that q is normalized and set equal to one.


Then:

$$V_T^{OBPI} = S_T + (K - S_T)^+ = K + (S_T - K)^+. \qquad (9.8)$$

This function is increasing and convex w.r.t. the risky asset price $S_T$ at
maturity. Therefore, it has the typical features of portfolio payoffs with
guarantee constraints. Indeed, such a portfolio can benefit from a market rise
since, if the risky asset price is higher than the strike at maturity, the return
is given by:

$$\frac{V_T}{V_0} = \frac{S_T}{S_0} \times \frac{1}{1 + \frac{P_0(K)}{S_0}}.$$

Thus, in that case, the percentage which is obtained is equal to $\frac{1}{1 + P_0(K)/S_0}$
(for the previous example, this percentage is equal to 94.4%; thus, for a return
$S_T/S_0$ equal to 30%, the return $V_T/V_0$ is about 22.7%).
The portfolio profile $V_T^{OBPI}$ is as follows (for K = 100):

[Figure: portfolio value $V_T$ (vertical axis, floor at 100) against risky asset value S (horizontal axis, kink at 100).]

FIGURE 9.1: OBPI portfolio value as function of S


9.1.2 Extensions of the OBPI method 


Several extensions can be proposed. They are mainly based on other kinds 
of options to be included in the portfolio. Among them, more general polyno- 
mial options can be substituted for the standard European options. As shown 
in Chapter 10, the choice of such options can be based on risk aversion and 
utility maximization. 


9.1.2.1 Polynomial options 


Assume that the guarantee constraint is as follows. At maturity, the
portfolio value $V_T$ must always be higher than

$$F_T = h_e(S_T).$$

For example, consider a linear function $h_e(S_T) = aS_T + b$. The investor has
a fixed guarantee b whatever the market evolution, and also a minimal
percentage a of the potential market rise. Moreover, the underlying asset of the
option is a power function $d\,S_T^m$ of the risky asset price at maturity.


Then, we are led to the following combination:

$$V_T = d\,S_T^m + ([aS_T + b] - d\,S_T^m)^+ = [aS_T + b] + (d\,S_T^m - [aS_T + b])^+. \qquad (9.9)$$

For a simple fixed guarantee, we get:

$$V_T = d\,S_T^m + (K - d\,S_T^m)^+ = K + (d\,S_T^m - K)^+. \qquad (9.10)$$
Figure 9.2 provides an example of such a power option payoff.

[Figure: call-power option profiles against the risky asset value; plot parameters r = 3%, mu = 6.8%, sigma = 15%, alpha = 0.1, K = 90, V0 = 100.]

FIGURE 9.2: Call-power option profiles


Figure 9.3 provides examples of paths of call-power portfolios and of the
standard OBPI portfolio for three types of paths of the risky asset S and three
different values of m. The OBPI strategy is obviously a special case of the
call-power option with m = 1 (linear case).

In order to compare the two methods, each row corresponds to the same path
of S (drop, rise and stability).

We note that the guaranteed level strongly depends on the value of the
parameter m. This result is quite intuitive since, in order to get a more and
more convex terminal payoff, the investor must bear more and more of the risk
of a portfolio value drop.


The first column provides the paths of Call-power options with m = 0.6 < 1.
Above the guaranteed level, the payoff is concave. For a bearish market, the
portfolio value reaches the floor very quickly but always remains above the
OBPI payoff. For a bullish market, the concavity of the payoff implies that
the portfolio value is smaller than the OBPI one. Thus, the concavity allows
the “reduction” of the variations of the risky asset, as illustrated by the last
graph in the first column.


The second column corresponds to a medium case with m = 2. The
guaranteed level is reduced from 98.96% (previous case) to 83.72% of the initial
invested amount, as illustrated by the first graph corresponding to a market
decrease. The last graph shows that, contrary to the concave case, the risky
asset variations are amplified.

[Figure: three columns of paths with guaranteed levels Power Call = 98.96%, Power Call = 83.72% and Power Call = 72.33% (m = 3).]

FIGURE 9.3: Call-power option paths


The last column corresponds to the smallest guaranteed level, with very
volatile paths. The next table summarizes information about the first four
moments (mean, standard deviation, skewness, and kurtosis) and the
guaranteed levels for the different values of m. Let $V_0 = 100$.


TABLE 9.1: OBPI and Call-power moments

m                  Guarantee  E(V_T)  σ(V_T)  Skewness  Kurtosis
0.6 (concave)      98.96%     103.14   5.733  1.33       5.05
1 (linear = OBPI)  94.72%     103.71   9.934  1.4345     5.59
2 (convex)         83.72%     104.88  22.06   1.5971     6.17
3 (convex)         72.33%     106.31  36.52   1.98       8.82
4 (convex)         60.86%     108.06  55.07   2.61      14.02
Asset S            0%         104.23  14.93   0.35       3.13


The call-power moments are increasing functions of m, whereas the
guaranteed level is decreasing w.r.t. the parameter m.
The following two figures show, respectively, the pdf and the cdf of
call-power options.

FIGURE 9.4: Call-power option pdf

FIGURE 9.5: Call-power option cdf (cumulative probability function)

Note that all cdf graphs intersect each other, which means that no
stochastic dominance exists. Besides, the higher the possible gain, the smaller
the guaranteed level.


Note that, for the case $h_e(S_T) = aS_T + b$, the portfolio value $V_T$ also has
the following form:

$$V_T = aS_T + b + \Big(\underbrace{d\,S_T^m - aS_T}_{Q(S_T)} - b\Big)^+.$$

The study of such portfolio payoffs (see for example [417]) requires the
examination of polynomial option properties.


As shown by Macovschi and Quittard-Pinon [370], a polynomial option can
be decomposed as a sum of power options. Indeed, we have to search for the roots
of the characteristic polynomial function associated with the payoff function.
Let $Q(S) = \sum_{j=1}^{n} a_j S^j$ be a polynomial function of degree n, and let b
be a positive scalar such that the polynomial function $P(S) = Q(S) - b$ has
exactly p non-negative roots $\lambda_1, \lambda_2, \ldots, \lambda_p$ with $\lambda_1 < \lambda_2 < \ldots < \lambda_p$. Then the
European call with strike b can be valued from the relation:


max (Q($) — 6,0) => Ca(b) = Soa; (Ea (0) ` (9.11) 


j=l k=1 


where $C_j(H)$ is the value of the power-$j$ option with strike H and maturity T.


The following example illustrates the possible payoffs of such portfolios 
according to the values of parameters d and m. 


Example 9.2 
Assume that the financial parameters are given by: 





p=0.1, o = 0.259 = 100 and r = 3%. 





Assume also that the investor’s characteristics are the following ones: 





$V_0 = 100$, $T = 5$, $a = 0.7$, $b = 90$, $d = 8.7253 \times 10^{-11}$, $m = 5.8333$.

The portfolio value with guarantee constraint is given by:

$$V_T = 0.7\,S_T + 90 + \max\left(8.7253 \times 10^{-11}\, S_T^{5.8333} - 0.7\,S_T - 90;\ 0\right).$$

The characteristic polynomial function associated with the payoff is equal to:

$$P(S) = 8.7253 \times 10^{-11}\, S^{5.8333} - 0.7\,S - 90.$$

It has only one positive root, $\lambda = 129.1958$.


From Equation (9.11), we deduce the polynomial option value:

$$C_Q(90) = \sum_{j} a_j C_j(\lambda^j) = -0.7\, C_1(129.1958) + 8.7253 \times 10^{-11}\, C_3(129.1958^{5.8333}),$$

where $C_1(K)$ and $C_3(K)$ are respectively equal to the standard Black-Scholes
call value and the Black-Scholes value of the power-5.8333 option with strike
K.


Recall that the Black-Scholes power-m option value is given by:

$$V_0 = S_0^m\, e^{(r + \frac{1}{2}\sigma^2 m)(m-1)T}\, N(d_1) - K e^{-rT} N(d_2),$$

with

$$d_1 = \frac{\ln\left(\frac{S_0^m}{K}\right) + \frac{\sigma^2 m(m-1)T}{2} + \left(mr + \frac{1}{2}(m\sigma)^2\right)T}{m\sigma\sqrt{T}}$$

and

$$d_2 = d_1 - m\sigma\sqrt{T}.$$
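The power-m option formula lends itself to a direct implementation. The sketch below values the payoff $(S_T^m - K)^+$ under geometric Brownian motion with constant r and $\sigma$; the function name `power_call` and its argument order are choices made for the example, and $d_2$ is written in the equivalent form $[\ln(S_0^m/K) + m(r - \sigma^2/2)T]/(m\sigma\sqrt{T})$.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def power_call(s0, k, r, sigma, t, m):
    """Black-Scholes value of the payoff max(S_T**m - k, 0), with m > 0."""
    vol = m * sigma * sqrt(t)                      # volatility of ln(S_T**m)
    d2 = (m * log(s0) - log(k) + m * (r - 0.5 * sigma ** 2) * t) / vol
    d1 = d2 + vol
    # Discounted risk-neutral expectation of S_T**m, divided by S_0**m
    growth = exp((m - 1.0) * (r + 0.5 * m * sigma ** 2) * t)
    return s0 ** m * growth * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)


if __name__ == "__main__":
    # m = 1 gives back the standard European call
    print(round(power_call(100.0, 100.0, 0.03, 0.2, 1.0, 1.0), 4))
    # a convex case: option on the square of the asset price
    print(round(power_call(100.0, 100.0 ** 2, 0.03, 0.2, 1.0, 2.0), 4))
```

Two sanity checks are available without simulation: for m = 1 the function reduces to the usual call value, and for a strike close to zero it tends to the discounted expectation of $S_T^m$, namely $S_0^m e^{(m-1)(r + m\sigma^2/2)T}$.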

Then, we get:

$$C_Q(90) = 18.9.$$

Thus, we deduce the initial portfolio value which allows the investor to
receive such a guarantee at maturity:

$$V_0 = 0.7\,S_0 + 90\,e^{-rT} + C_Q(90) = 166.37.$$

This investor is sure to recover 54.09% of his initial investment at maturity.
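Some of the numbers of Example 9.2 can be cross-checked with a short script. The sketch below locates the positive root of the characteristic polynomial by bisection and recomputes the initial value and the guaranteed fraction; the option value $C_Q(90) = 18.9$ is taken as given from the example rather than recomputed, and all function names are illustrative.

```python
from math import exp

# Data of Example 9.2
a, b = 0.7, 90.0
d, m = 8.7253e-11, 5.8333
S0, r, T = 100.0, 0.03, 5.0
C_Q = 18.9   # polynomial option value quoted in the example (not recomputed)


def char_poly(s):
    """Characteristic polynomial P(S) = d S^m - a S - b."""
    return d * s ** m - a * s - b


def positive_root(lo=1.0, hi=1000.0, tol=1e-10):
    """Bisection on [lo, hi], where P(lo) < 0 < P(hi); P is convex for m > 1."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if char_poly(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def terminal_value(s_t):
    """Guaranteed payoff: linear floor plus the polynomial call payoff."""
    return a * s_t + b + max(char_poly(s_t), 0.0)


lam = positive_root()
V0 = a * S0 + b * exp(-r * T) + C_Q
guaranteed_fraction = b / V0   # worst case, reached when S_T tends to 0

if __name__ == "__main__":
    print(round(lam, 4), round(V0, 2), round(guaranteed_fraction, 4))
```

Below the root, the terminal value coincides with the linear floor $aS_T + b$; above it, the investor receives the power payoff $d\,S_T^m$.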
The following figure indicates the terminal values of the portfolio as a function
of the risky asset values $S_T$.

FIGURE 9.6: Convex case with linear constraints (terminal portfolio payoff)

An example of a concave profile with linear constraint:

FIGURE 9.7: Concave case with linear constraints (terminal portfolio payoff; r = 3%, mu = 6.8%, sigma = 30%, alpha = 0.1, K = 90, V0 = 100)
9.1.2.2 Other possible extensions


Obviously, other options can be introduced to provide a percentage of the 
potential market rise. Indeed, any portfolio with terminal value Vr such that: 


Vr = K + Hr, (9.12) 
where Hr is a positive random variable is always above the level K. 


The choice of a particular payoff Hr may depend on: 


e Market predictions: rise or drop of the financial market, volatility levels, 
etc. 


e The type of risky assets: financial index, hedge fund, etc. 
e The insurance cost associated to each chosen derivative: lookback op- 


tions, corridor options, etc. 


REMARK 9.3 As seen in the next chapter, from a theoretical point of
view, it may be optimal to use options on a power of the risky asset.


From Relations (9.1) and (9.2), the reduction of the insurance cost can allow
the investor to get a higher percentage of the market rise.


Among possible options, consider for example the capped Call $C_T^{cap}$:

$$C_T^{cap} = K + \min(\max(S_T - K, 0); K'), \qquad (9.13)$$

which always provides a guaranteed amount equal to K, is equal to the risky
asset value $S_T$ when $K < S_T < K + K'$, but remains constant and equal to
$K + K'$ for higher values of $S_T$.


This allows a reduction of the insurance cost and is profitable if the value $S_T$
remains below $K + K'$.


However, options must often be synthesized by a dynamic hedging strategy 
when they are not available on the financial market. 


Therefore, standard problems appear, such as imperfect hedging, transac- 
tion costs, and rebalancing strategy, which generally increase the insurance 
cost. 
9.2 The Constant Proportion Portfolio Insurance


The CPPI, introduced by Perold [406], uses a simplified strategy to allocate 
assets dynamically over time. The investor starts by setting a floor equal to 
the lowest acceptable value of the portfolio. Then, he computes the cushion 
as the excess of the portfolio value over the floor and determines the amount 
allocated to the risky asset by multiplying the cushion by a predetermined 
multiple. Both the floor and the multiple are functions of the investor’s risk 
tolerance and are exogenous to the model. The total amount allocated to the 
risky asset is known as the exposure. The remaining funds are invested in the 
reserve asset, usually T-bills. 







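[Figure: portfolio value V (starting at $V_0$) and floor F (starting at $F_0$) plotted against time; the cushion is the gap between the two curves.]

FIGURE 9.8: Portfolio value and cushion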


The higher the multiple, the more the investor will participate in a sustained 
increase in stock prices. Nevertheless, the higher the multiple, the faster the 
portfolio will approach the floor when there is a sustained decrease in stock 
prices. As the cushion approaches zero, exposure approaches zero too. In 
continuous time, this keeps the portfolio value from falling below the floor. 
Portfolio value will fall below the floor only when there is a very sharp drop 
in the market before the investor has a chance to trade. 
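The allocation rule described above can be sketched in discrete time. The code below is a minimal illustration, not the continuous-time strategy studied in the sequel: it assumes rebalancing at each date of a given price path, a floor growing at the riskless rate, no transaction costs, no borrowing limit, and an exposure set to zero once the portfolio value has fallen below the floor. The function name and the sample paths are hypothetical.

```python
from math import exp


def cppi_path(prices, v0, floor0, multiple, r, dt):
    """Return the list of (portfolio value, floor) pairs along `prices`.

    At each date the exposure is multiple * cushion and the remainder is held
    in the reserve asset, which earns the rate r over each step of length dt.
    """
    value, floor = v0, floor0
    path = [(value, floor)]
    for s_prev, s_next in zip(prices, prices[1:]):
        cushion = max(value - floor, 0.0)     # nothing left to lever below the floor
        exposure = multiple * cushion         # amount invested in the risky asset
        reserve = value - exposure            # may be negative (borrowing) if m > 1
        value = exposure * s_next / s_prev + reserve * exp(r * dt)
        floor *= exp(r * dt)
        path.append((value, floor))
    return path


if __name__ == "__main__":
    slow_decline = [100.0 * 0.9 ** k for k in range(11)]
    crash = [100.0, 65.0]
    for name, prices in (("slow decline", slow_decline), ("crash", crash)):
        value, floor = cppi_path(prices, 100.0, 80.0, 4.0, 0.03, 1.0 / 12.0)[-1]
        print(name, round(value, 2), round(floor, 2))
```

In this discrete setting, one step multiplies the cushion by $m\,S_{k+1}/S_k + (1 - m)e^{r\,dt}$, so the floor is breached only if the risky asset loses roughly more than 1/m of its value between two rebalancing dates, which is the "very sharp drop" mentioned above.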

In what follows, we refer mainly to Prigent [413]. 
9.2.1 The standard CPPI method


The CPPI method consists of managing a dynamic portfolio so that its value
is above a floor F at any time t. The value of the floor gives the dynamically
insured amount. It is assumed to evolve as a riskless asset, according to:

$$dF_t = F_t\, r\, dt. \qquad (9.14)$$

Obviously, the initial floor $F_0$ is less than the initial portfolio value $V_0^{CPPI}$.
The difference $V_0^{CPPI} - F_0$ is called the cushion, denoted by $C_0$. Its value $C_t$
at any time t in [0, T] is given by:

$$C_t = V_t^{CPPI} - F_t. \qquad (9.15)$$

Denote by $e_t$ the exposure, which is the total amount invested in the risky
asset. The standard CPPI method consists of letting

$$e_t = mC_t, \qquad (9.16)$$

where m is a constant called the multiple. Note that the interesting case for
portfolio insurance corresponds to m > 1, that is, when the payoff function is
convex and can provide a significant percentage of the market rise.


Assume that the risky asset price process $(S_t)_t$ is a diffusion with jumps:

$$dS_t = S_{t^-}\left[\mu(t, S_t)\,dt + \sigma(t, S_t)\,dW_t + \delta(t, S_t)\,d\gamma\right], \qquad (9.17)$$

where $(W_t)_t$ is a standard Brownian motion, independent from the Poisson
process with measure of jumps $\gamma$.


In particular, this means that:


• The sequence of random times $(T_n)_n$ corresponding to jumps satisfies
the following property: the interarrival times $T_{n+1} - T_n$ are independent
and have the same exponential distribution with parameter denoted by
$\lambda$.


• The relative jumps of the risky asset $\Delta S_{T_n}/S_{T_n^-}$ are equal to $\delta(T_n, S_{T_n})$. They
are supposed to be strictly higher than -1 (in order for the price S to
be strictly positive).
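We deduce the portfolio value:


PROPOSITION 9.1
i) The value of this portfolio $V_t^{CPPI}$ at any time t in the period [0, T] is given
by:

$$V_t^{CPPI}(m, S_t) = F_0 e^{rt} + C_t, \qquad (9.18)$$

where the cushion value $C_t$ is equal to:

$$C_t = C_0 \exp\left((1-m)rt + m\left[\int_0^t \left(\mu - \tfrac{1}{2}m\sigma^2\right)(s, S_s)\,ds + \int_0^t \sigma(s, S_s)\,dW_s\right]\right) \times \prod_{0 \le T_n \le t}\left(1 + m\,\delta(T_n, S_{T_n})\right). \qquad (9.19)$$

ii) When the risky asset price has no jump ($\delta = 0$), and the coefficients
$\mu(.,.)$ and $\sigma(.,.)$ are constant, we deduce the following standard formula:

$$V_t^{CPPI}(m, S_t) = F_0 e^{rt} + \alpha_t S_t^m, \qquad (9.20)$$

where

$$\alpha_t = \left(\frac{C_0}{S_0^m}\right)\exp[\beta t] \quad \text{and} \quad \beta = \left(r - m\left(r - \tfrac{1}{2}\sigma^2\right) - m^2\frac{\sigma^2}{2}\right). \qquad (9.21)$$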

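PROOF - The portfolio value of the CPPI strategy is given by:

$$dV_t^{CPPI} = \left(V_t^{CPPI} - e_t\right)\frac{dB_t}{B_t} + e_t\frac{dS_t}{S_{t^-}}.$$

The CPPI strategy is based on the following relations: $V_t^{CPPI} = C_t + F_t$,
$e_t = mC_t$, and the floor value satisfies $dF_t = F_t\,r\,dt$. Therefore, the cushion
value is given by:

$$\begin{aligned}
dC_t &= d\left(V_t^{CPPI} - F_t\right) \\
&= \left(V_t^{CPPI} - e_t\right)\frac{dB_t}{B_t} + e_t\frac{dS_t}{S_{t^-}} - dF_t \\
&= \left(C_t + F_t - mC_t\right)\frac{dB_t}{B_t} + mC_t\frac{dS_t}{S_{t^-}} - dF_t \\
&= \left(C_t - mC_t\right)\frac{dB_t}{B_t} + mC_t\frac{dS_t}{S_{t^-}} \\
&= C_t\left[\left(r + m\left(\mu(t, S_t) - r\right)\right)dt + m\sigma(t, S_t)\,dW_t + m\delta(t, S_t)\,d\gamma\right].
\end{aligned}$$

Consequently, the cushion value is a stochastic exponential (the so-called
Doléans-Dade exponential as shown in Appendix B).
Thus, the cushion value $C_t$ at any time t is given by:

$$C_t = C_0 \exp\left((1-m)rt + m\left[\int_0^t \left(\mu - \tfrac{1}{2}m\sigma^2\right)(s, S_s)\,ds + \int_0^t \sigma(s, S_s)\,dW_s\right]\right) \times \prod_{0 \le T_n \le t}\left(1 + m\,\delta(T_n, S_{T_n})\right).$$

When the risky asset price has no jump and the coefficients $\mu(.,.)$ and $\sigma(.,.)$
are constant, we have:

$$C_t = C_0 \exp\left[\left(m(\mu - r) + r - \frac{m^2\sigma^2}{2}\right)t + m\sigma W_t\right]. \qquad (9.22)$$

Then, using the relation $S_t = S_0 \exp\left[\sigma W_t + \left(\mu - \tfrac{1}{2}\sigma^2\right)t\right]$, we deduce:

$$W_t = \frac{1}{\sigma}\left[\ln\left(\frac{S_t}{S_0}\right) - \left(\mu - \tfrac{1}{2}\sigma^2\right)t\right].$$

Therefore, substituting $W_t$ in the expression of $C_t$, we have:

$$C_t(m, S_t) = C_0\left(\frac{S_t}{S_0}\right)^m \exp\left[\left(r - m\left(r - \tfrac{1}{2}\sigma^2\right) - m^2\frac{\sigma^2}{2}\right)t\right] = \alpha_t S_t^m,$$

where

$$\alpha_t = \left(\frac{C_0}{S_0^m}\right)\exp[\beta t] \quad \text{and} \quad \beta = \left(r - m\left(r - \tfrac{1}{2}\sigma^2\right) - m^2\frac{\sigma^2}{2}\right).$$

Finally, the CPPI portfolio value is given by:

$$V_t^{CPPI}(m, S_t) = F_0 e^{rt} + \alpha_t S_t^m.$$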








REMARK 9.4 Consequently, the guarantee constraint is satisfied as soon
as the relative jumps are such that:

$$\delta(T_n, S_{T_n}) > -1/m. \qquad (9.23)$$

Thus, when the relative jumps of the risky asset are higher than a negative
constant d, the condition $0 < m < -1/d$ ensures the positivity of the cushion
value. For example, if d is equal to -20%, we have m < 5. If d is equal to -10%,
we have m < 10. Note that these upper bounds on the multiple do not depend on
the probability distribution of the jump times $T_n$.


REMARK 9.5 - When the risky asset price is a geometric Brownian
motion ($\delta = 0$, and $\mu$ and $\sigma$ are constant), the cushion value is given by:

$$C_t = C_0\, e^{m\sigma W_t + \left[r + m(\mu - r) - \frac{m^2\sigma^2}{2}\right]t} \quad \text{with } C_0 = V_0 - F_0. \qquad (9.24)$$

In that case, the portfolio and cushion values are path independent. They
have lognormal distributions, up to the deterministic floor value $F_T$ for the
portfolio. The volatility is equal to $m\sigma$, and the instantaneous mean is given
by $r + m(\mu - r)$. This illustrates the leverage effect of the multiple m: the
higher the multiple, the higher the excess return but also the volatility. Since
we have $V_t^{CPPI}(m, S_t) = F_0 e^{rt} + \alpha_t S_t^m$, the portfolio profile is convex as
soon as the multiple m is higher than 1.
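The path independence stated in Remark 9.5 can be illustrated numerically in the geometric Brownian motion case. The sketch below compares the cushion written in terms of the Brownian motion, as in (9.24), with the closed form $\alpha_t S_t^m$ of (9.20) and (9.21); function names and parameter values are illustrative assumptions.

```python
from math import exp


def gbm_price(s0, mu, sigma, t, w):
    """Risky asset price S_t for a given value w of the Brownian motion W_t."""
    return s0 * exp((mu - 0.5 * sigma ** 2) * t + sigma * w)


def cushion_from_brownian(c0, m, mu, r, sigma, t, w):
    """Cushion C_t as in relation (9.24)."""
    return c0 * exp(m * sigma * w + (r + m * (mu - r) - 0.5 * m ** 2 * sigma ** 2) * t)


def cppi_value_closed_form(v0, f0, m, s0, s_t, r, sigma, t):
    """Portfolio value F_0 e^{rt} + alpha_t S_t^m, relations (9.20)-(9.21)."""
    beta = r - m * (r - 0.5 * sigma ** 2) - 0.5 * m ** 2 * sigma ** 2
    alpha_t = (v0 - f0) / s0 ** m * exp(beta * t)
    return f0 * exp(r * t) + alpha_t * s_t ** m


if __name__ == "__main__":
    v0, f0, m, s0 = 100.0, 90.0, 4.0, 100.0
    mu, r, sigma, t = 0.08, 0.03, 0.2, 1.0
    for w in (-1.0, 0.0, 0.7):
        s_t = gbm_price(s0, mu, sigma, t, w)
        direct = f0 * exp(r * t) + cushion_from_brownian(v0 - f0, m, mu, r, sigma, t, w)
        closed = cppi_value_closed_form(v0, f0, m, s0, s_t, r, sigma, t)
        print(round(s_t, 2), round(direct, 4), round(closed, 4))
```

Note that the closed form does not involve the drift $\mu$: once $S_t$ is known, the CPPI value is determined, whatever the path followed by the risky asset.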
9.2.1.1 Lévy process case


Assume that $\delta(.)$ is not equal to 0, and $\mu$ and $\sigma$ are constant. Suppose
also that the jumps are iid with probability distribution H(dx), with finite
mean $E[\delta(T_1, S_{T_1})]$ denoted by b and finite second moment $E[\delta^2(T_1, S_{T_1})]$
denoted by c. The logarithmic return of the risky asset S is a Lévy process
(see Appendix B). Then, we deduce:

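PROPOSITION 9.2
- The first two moments of the portfolio values are given by:

$$\begin{cases} E[V_t] = (V_0 - F_0)\,e^{[r + m(\mu + b\lambda - r)]t} + F_0 e^{rt}, \\ Var[V_t] = (V_0 - F_0)^2\, e^{2[r + m(\mu + b\lambda - r)]t}\left[e^{m^2(\sigma^2 + c\lambda)t} - 1\right]. \end{cases} \qquad (9.25)$$

- The pdf of the cushion $C_t$ is determined as follows. Let $g_{0,t}$ denote the pdf
without jump. The function $g_{0,t}$ is defined by:

$$g_{0,t}(x) = \frac{1}{x\sqrt{2\pi\sigma^2 m^2 t}} \exp\left(-\frac{\left(\ln\left[\frac{x}{C_0}\right] - \left[r + m(\mu - r) - \frac{m^2\sigma^2}{2}\right]t\right)^2}{2\sigma^2 m^2 t}\right). \qquad (9.26)$$

When jumps can occur, we have:

$$g_{C,t}(x) = \sum_{n=0}^{\infty} e^{-\lambda t}\frac{(\lambda t)^n}{n!} \int g_{0,t}\left(\frac{x}{\prod_{i \le n}(1 + m y_i)}\right) \frac{H^{n}(dy_1, \ldots, dy_n)}{\prod_{i \le n}(1 + m y_i)}, \qquad (9.27)$$

where $H^{n}$ denotes the joint distribution of the relative jumps $\delta(T_i, S_{T_i})$, $i \le n$.
Since they are assumed to be independent with the same distribution H, we
deduce the relation:

$$H^{n}(dy_1, \ldots, dy_n) = \otimes_{i \le n} H(dy_i). \qquad (9.28)$$

Consequently, the pdf of the portfolio value is given by:

$$f_{V_t}(x) = g_{C,t}(x - F_0 e^{rt}). \qquad (9.29)$$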


REMARK 9.6 From the mean-variance point of view, note that both 
mean and variance of the CPPI portfolio value are increasing functions of the 
multiple m and decreasing w.r.t. the initial floor value $P_0$. It is not possible 
to optimize w.r.t. the multiple m according to the Markowitz criterion. Indeed, 
for a fixed mean level L, the initial floor value $P_0$ is a function of the multiple 
given by: 

$$P_0(m) = \frac{L - V_0 e^{[r + m(\mu + b\lambda - r)]t}}{e^{rt}\left[1 - e^{m(\mu + b\lambda - r)t}\right]}, \qquad (9.30)$$

which implies 

$$Var[V_t/V_0] = \left(\frac{V_0 e^{rt} - L}{V_0}\right)^2 \frac{e^{2m(\mu + b\lambda - r)t}\left[e^{m^2(\sigma^2 + c\lambda)t} - 1\right]}{\left[1 - e^{m(\mu + b\lambda - r)t}\right]^2}, \qquad (9.31)$$

which is a function of the multiple that does not attain a minimum. 
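
The mechanics of this remark can be checked numerically. The sketch below is only an illustration: it evaluates the moments of relation (9.25), solves relation (9.30) for the floor that keeps the mean at a target level L, and records the standard deviation for several multiples. All parameter values (jump intensity, jump moments, target mean) are hypothetical choices, not values taken from the text.

```python
import math

def cppi_moments(V0, P0, m, mu, sigma, r, lam, b, c, t):
    """Mean and variance of the CPPI value V_t, following relation (9.25)."""
    growth = r + m * (mu + b * lam - r)   # growth rate of the expected cushion
    cushion0 = V0 - P0
    mean = cushion0 * math.exp(growth * t) + P0 * math.exp(r * t)
    var = cushion0 ** 2 * math.exp(2 * growth * t) * (
        math.exp(m ** 2 * (sigma ** 2 + c * lam) * t) - 1.0
    )
    return mean, var

def floor_for_mean(L, V0, m, mu, r, lam, b, t):
    """Initial floor such that E[V_t] = L, following relation (9.30)."""
    theta = mu + b * lam - r
    return (L - V0 * math.exp((r + m * theta) * t)) / (
        math.exp(r * t) * (1.0 - math.exp(m * theta * t))
    )

# Hypothetical parameters (assumptions made for this illustration only)
MU, SIGMA, R = 0.10, 0.20, 0.05
LAM, B_MEAN, C_SECOND = 0.5, -0.02, 0.001   # jump intensity, E[delta], E[delta^2]
V0, HORIZON, TARGET_MEAN = 100.0, 1.0, 108.0

results = []
for m in (1.5, 2.0, 4.0, 6.0, 8.0):
    P0 = floor_for_mean(TARGET_MEAN, V0, m, MU, R, LAM, B_MEAN, HORIZON)
    mean, var = cppi_moments(V0, P0, m, MU, SIGMA, R, LAM, B_MEAN, C_SECOND, HORIZON)
    results.append((m, P0, mean, math.sqrt(var)))

for m, P0, mean, std in results:
    print(f"m={m:>4}: floor={P0:8.3f}  mean={mean:8.3f}  std={std:7.3f}")
```

With these assumed inputs, the mean stays at the target for every multiple while the floor rises and the standard deviation grows with m, which is consistent with the absence of an interior mean-variance optimum in m.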

In fact, the expectation of gains is not the first goal: the guarantee level 
is crucial. Once it has been determined, the multiple allows one to adjust the 
percentage of anticipated profit when the financial market rises. 


The mean of the portfolio return is an increasing function of the multi- 
ple m. However, as mentioned previously in relation (9.23), the risky asset 
discontinuities lead to upper bounds on the multiple m. 

In order to reduce this constraint, another weaker guarantee condition can 
be considered, based on a Value-at-Risk or Expected Shortfall 
criterion. For example, consider the VaR type condition: 

$$P[C_t \geq 0, \forall t \leq T] \geq 1 - \epsilon, \qquad (9.32)$$
where $\epsilon$ is “small.” This condition is equivalent to: 

$$P\left[\forall T_n \leq T,\ \delta(T_n, S_{T_n}) \geq -\frac{1}{m}\right] \geq 1 - \epsilon. \qquad (9.33)$$

Denote by H the cdf of relative jumps and assume that it is strictly increasing. 
Then, we deduce: 


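PROPOSITION 9.3 
The condition 

$$P[C_t \geq 0, \forall t \leq T] \geq 1 - \epsilon$$
is equivalent to the following condition on the multiple m: 

$$m \leq \frac{-1}{H^{(-1)}\left(\frac{1}{\lambda T}\ln\left(\frac{1}{1-\epsilon}\right)\right)}. \qquad (9.34)$$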





PROOF The VaR guarantee condition for a given threshold $1 - \epsilon$ is the 
following: 

$$P[C_t \geq 0, \forall t \leq T] \geq 1 - \epsilon. \qquad (9.35)$$

The cushion value provided in relation (9.19) shows that the variation due to 
jumps is equal to $\prod_{0 < T_n \leq t}(1 + m\,\delta(T_n, S_{T_n}))$. This term must remain positive 
with a probability of at least $1 - \epsilon$ at any time t. Therefore, each term in this 
product must be positive. This is equivalent to: 

$$P\left[\forall T_n \leq T,\ \delta(T_n, S_{T_n}) \geq -\frac{1}{m}\right] \geq 1 - \epsilon. \qquad (9.36)$$

Denote by $N_T$ the number of jumps before maturity T. Since in the Lévy 
case jumps are independent of their occurrence times, we deduce: 

$$P\left[\forall T_n \leq T,\ \delta(T_n, S_{T_n}) \geq -\frac{1}{m}\right] = \sum_k P\left[\forall n \leq k,\ \delta(T_n, S_{T_n}) \geq -\frac{1}{m}\right] P[N_T = k]. \qquad (9.37)$$

The random variable $N_T$, which counts the number of jumps during the time 
period [0, T], has a Poisson distribution with parameter $\lambda T$. Therefore, the VaR 
guarantee constraint at the threshold $1 - \epsilon$ is equivalent to: 

$$m \leq \frac{-1}{H^{(-1)}\left(\frac{1}{\lambda T}\ln\left(\frac{1}{1-\epsilon}\right)\right)}. \qquad (9.38)$$





REMARK 9.7 The latter condition (9.38) provides an upper bound on 
the multiple which is obviously higher than -1/d, where d is the infimum 
of the range of the distribution H. Note also that this upper bound is a 
decreasing function of the jump intensity $\lambda$. For high values of $\lambda$, the upper 
bound converges to -1/d. 
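
As a rough numerical illustration of relation (9.38), the sketch below evaluates the bound when the relative jumps are assumed to be uniformly distributed on [-20%, +10%]. The uniform law, the helper names and the parameter values are assumptions made for this example. The check `no_breach_probability` uses the fact that summing the Poisson series in (9.37) with iid jumps gives $\exp(-\lambda T H(-1/m))$.

```python
import math

def var_multiple_bound(H_inv, lam, T, eps):
    """Upper bound on the multiple m from relation (9.38)."""
    q = math.log(1.0 / (1.0 - eps)) / (lam * T)
    if q >= 1.0:
        # Jumps are so rare that P[no jump before T] already exceeds 1 - eps.
        return math.inf
    x = H_inv(q)
    # A nonnegative quantile means no relevant downward jump: no constraint on m.
    return math.inf if x >= 0.0 else -1.0 / x

# Assumption for illustration: relative jumps uniform on [D_MIN, D_MAX]
D_MIN, D_MAX = -0.20, 0.10

def H(x):
    """cdf of the assumed uniform jump distribution."""
    return min(1.0, max(0.0, (x - D_MIN) / (D_MAX - D_MIN)))

def H_inv(q):
    """Quantile function of the assumed uniform jump distribution."""
    return D_MIN + q * (D_MAX - D_MIN)

def no_breach_probability(m, lam, T):
    """P[all jumps before T are >= -1/m] for Poisson arrivals and iid jumps."""
    return math.exp(-lam * T * H(-1.0 / m))

T, EPS = 1.0, 0.01
bounds = {lam: var_multiple_bound(H_inv, lam, T, EPS) for lam in (1.0, 5.0, 50.0, 500.0)}
for lam, m_max in bounds.items():
    print(f"lambda={lam:6.1f}: m_max={m_max:.4f}, "
          f"P[no breach]={no_breach_probability(m_max, lam, T):.4f}")
```

Under these assumptions, every bound lies above -1/d = 5, decreases with the intensity, and approaches 5 for large intensities, as stated in Remark 9.7.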


9.2.1.2 Discrete-time case 


Suppose now that the investor trades according to a discrete-time grid $t_k$, 
$k \leq n$. For example, he wants to limit transaction costs, or portfolio assets 
are not sufficiently liquid. In this framework, the CPPI portfolio value can be 
determined as follows. 

Denote by $X_k$ the opposite of the arithmetical return of the risky asset 
between $t_{k-1}$ and $t_k$. We have: 

$$X_k = -\frac{S_{t_k} - S_{t_{k-1}}}{S_{t_{k-1}}}. \qquad (9.39)$$

Consider the maximum M of these values. For each l, define: 
$$M_l = Max(X_1, \ldots, X_l).$$


Denote by $V_k$ the portfolio value at time $t_k$. The guarantee constraint is to 
keep the portfolio value $V_k$ above the floor $P_k$. The exposure $e_k$ invested in 
the risky asset $S_k$ is equal to $mC_k$, where the cushion value $C_k$ is equal to 
$V_k - P_k$. The remaining amount $(V_k - e_k)$ is invested in the riskless asset with 
return $r_{k+1}$ on the time period $[t_k, t_{k+1}]$. 

Therefore, the dynamic evolution of the portfolio value is given by (similar 
to the continuous-time case): 

$$V_{k+1} = V_k - e_k X_{k+1} + (V_k - e_k) r_{k+1}, \qquad (9.40)$$
where the cushion value is defined by: 
$$C_{k+1} = C_k\left[1 - m X_{k+1} + (1 - m) r_{k+1}\right]. \qquad (9.41)$$

Since at any time $t_k$ the cushion value must be positive, we get, for any 
$k \leq n$, 
$$-m X_k + (1 - m) r_k > -1.$$


Besides, since $r_k$ is relatively small, the previous inequality is “approximately” 
equivalent to: 

$$\forall k \leq n,\ X_k < \frac{1}{m}, \quad \text{then also } M_n = Max(X_k)_{k \leq n} < \frac{1}{m}.$$

We deduce that the guarantee condition is satisfied as soon as the multiple 
m is smaller than 1/d, where d is the supremum of the range of the distribution of 
the random variables $X_k$ (that is, the largest possible relative drop of the risky 
asset over one period). 


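For a VaR guarantee constraint at the level $(1 - \epsilon)$ on the time period [0, T]: 
$$P[C_t \geq 0, \forall t \in [0,T]] \geq 1 - \epsilon,$$
the maximum $M_n$ of the values $X_k$ at times $t_k$ during [0, T] must satisfy: 
$$P\left[\forall t_k \in [0,T],\ X_k \leq \frac{1}{m}\right] = P\left[M_n \leq \frac{1}{m}\right] \geq 1 - \epsilon. \qquad (9.42)$$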

According to the distributions of the random variables X;, upper bounds 
on the multiple are deduced. 


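Suppose for example that the random variables $X_k$ are iid with cdf H which 
is assumed to have an inverse function $H^{(-1)}$. 


Then, since $P[M_n \leq 1/m] = H(1/m)^n$, the following condition must be satisfied by the multiple m: 

$$m \leq \frac{1}{H^{(-1)}\left((1 - \epsilon)^{1/n}\right)}. \qquad (9.43)$$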
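
The discrete-time recursion can be sketched as follows. This is a minimal illustration under stated assumptions: the floor is assumed to grow at the riskless return, $P_{k+1} = P_k(1 + r_{k+1})$, which is what makes (9.40) consistent with (9.41); the return path and the Gaussian law used for the VaR bound are hypothetical; and the bound is the one obtained from (9.42) for iid $X_k$, namely $m \leq 1/H^{(-1)}((1-\epsilon)^{1/n})$.

```python
from statistics import NormalDist

def discrete_multiple_bound(H_inv, n, eps):
    """VaR-type bound on m for n iid periods: m <= 1 / H^{-1}((1 - eps)^(1/n))."""
    x = H_inv((1.0 - eps) ** (1.0 / n))
    return float("inf") if x <= 0.0 else 1.0 / x

def cppi_discrete(V0, P0, m, returns, rates):
    """Iterate relation (9.40); returns final value, final floor, cushion path."""
    V, P = V0, P0
    cushions = [V - P]
    for ret, r in zip(returns, rates):
        X = -ret                         # opposite of the arithmetical return (9.39)
        e = m * (V - P)                  # exposure e_k = m * C_k
        V = V - e * X + (V - e) * r      # relation (9.40)
        P = P * (1.0 + r)                # assumed floor dynamics
        cushions.append(V - P)
    return V, P, cushions

# Hypothetical path of risky returns and riskless returns per period
RETURNS = [0.02, -0.05, 0.03, -0.10]
RATES = [0.001] * len(RETURNS)
M = 4.0
V_end, P_end, cushions = cppi_discrete(100.0, 90.0, M, RETURNS, RATES)

# Cross-check with the cushion recursion (9.41)
cushion_check = 10.0
for ret, r in zip(RETURNS, RATES):
    cushion_check *= 1.0 - M * (-ret) + (1.0 - M) * r

# Assumed law for X_k: Gaussian weekly losses, 52 rebalancing dates, eps = 1%
loss_law = NormalDist(mu=-0.002, sigma=0.03)
weekly_bound = discrete_multiple_bound(loss_law.inv_cdf, n=52, eps=0.01)
print(f"final cushion={cushions[-1]:.4f}, weekly VaR bound on m={weekly_bound:.2f}")
```

A Gaussian law has an unbounded range, so the sure guarantee would force m to zero; only the probabilistic bound is meaningful in that assumed setting.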
9.2.1.3 Random rebalancing case 


When the cushion rises due to market fluctuations, the exposure may be 
close to the maximum amount that the investor wants to invest in the risky 
asset. When the exposure is below this limit, the investor can have a tolerance 
w.r.t. these market variations. It means that he can fix a percentage of market 
drops and rises such that, when market fluctuations are above this level, he 
rebalances his portfolio. 

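Consider for example a lower bound $\underline{m}$ and an upper bound $\overline{m}$ on the 
multiple $m^*$. The investor begins by choosing an initial floor $P_0$, a quantity 
$\theta_0^S$ invested in the risky asset S and a quantity $\theta_0^B$ invested in the riskless 
asset B. 

From initial conditions, we deduce: 

$$\theta_0^S = \frac{m^*(V_0 - P_0)}{S_0}. \qquad (9.44)$$

The portfolio is rebalanced as soon as the ratio $e_t / C_t$ of the exposure to the 
cushion is smaller than $\underline{m}$ or higher than $\overline{m}$. If $\theta_0^B B_0 < P_0$, then, for the 
geometric Brownian case, the condition 

$$\underline{m} < \frac{e_t}{C_t} < \overline{m} \qquad (9.45)$$
is equivalent to: 
$$A < X_t < B, \qquad (9.46)$$

where $(X_t)_t$ is a Brownian motion with drift, defined by: 
$$X_t = (\mu - r - \tfrac{1}{2}\sigma^2)t + \sigma W_t,$$
and the parameters A and B are given by: 

$$A = Ln\left[\frac{\overline{m}}{\overline{m} - 1}\,\frac{P_0 - \theta_0^B B_0}{m^*(V_0 - P_0)}\right] = Ln\left[\frac{\overline{m}}{\overline{m} - 1}\,\frac{m^* - 1}{m^*}\right],$$
$$B = Ln\left[\frac{\underline{m}}{\underline{m} - 1}\,\frac{P_0 - \theta_0^B B_0}{m^*(V_0 - P_0)}\right] = Ln\left[\frac{\underline{m}}{\underline{m} - 1}\,\frac{m^* - 1}{m^*}\right].$$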
The conditional distribution of the rebalancing times is characterized by 
the exit time of the process $(X_t)_t$ from the corridor [A, B]. This probability 
distribution is deduced from the trivariate distribution of the maximum, 
minimum, and terminal value of the Brownian motion (see Revuz and Yor [423] 
or Borodin and Salminen [85]). 








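The pdf of this joint distribution with a constant drift $\mu$ is defined by, for 
any x in [A, B]: 

$$g(x, A, B) = \exp\left[\frac{\mu x}{\sigma^2} - \frac{\mu^2 t}{2\sigma^2}\right] \times \sum_{n=-\infty}^{+\infty} \frac{1}{\sigma\sqrt{t}}\left[\phi\left(\frac{x - 2n(B-A)}{\sigma\sqrt{t}}\right) - \phi\left(\frac{x - 2n(B-A) - 2A}{\sigma\sqrt{t}}\right)\right], \qquad (9.47)$$

where $\phi$ is the pdf of the standard Gaussian distribution and N its cdf. 
If A < 0 and B > 0, then the distribution of the first passage time $T_1$ is 
given by: 

$$P[T_1 \leq t] = 1 - P[Max_{s \leq t} X_s < B,\ Min_{s \leq t} X_s > A] = 1 - \sum_{n=-\infty}^{+\infty}\Big\{$$
$$e^{\frac{2\mu n(B-A)}{\sigma^2}}\left[N\left(\frac{B - \mu t - 2n(B-A)}{\sigma\sqrt{t}}\right) - N\left(\frac{A - \mu t - 2n(B-A)}{\sigma\sqrt{t}}\right)\right]$$
$$- e^{\frac{2\mu (n(B-A) + A)}{\sigma^2}}\left[N\left(\frac{B - \mu t - 2n(B-A) - 2A}{\sigma\sqrt{t}}\right) - N\left(\frac{A - \mu t - 2n(B-A) - 2A}{\sigma\sqrt{t}}\right)\right]\Big\}.$$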

After the first time $T_1$, the new portfolio value is determined from the 
following relations: the new initial floor is equal to $P_0 e^{rT_1}$. The quantities $\theta_{T_1}^S$ 
and $\theta_{T_1}^B$ respectively invested in S and B are determined from rebalancing 
conditions. We have in particular: 

$$\theta_{T_1}^S = \frac{m^*(V_{T_1} - P_0 e^{rT_1})}{S_{T_1}}. \qquad (9.48)$$

From the Lévy property, the distribution of the new interarrival time $T_2 - T_1$ 
is deduced from the previous one, by stopping all processes at time $T_1$. Note 
that when jumps can occur but the logarithmic return is still a Lévy process, 
the previous cdf can be defined from infinite expansions. 
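
The first rebalancing time can be explored numerically in the geometric Brownian case. The sketch below is an illustration under assumptions: the corridor bounds use the simplified expression $Ln[\frac{m}{m-1}\frac{m^*-1}{m^*}]$ (which relies on $P_0 - \theta_0^B B_0 = (m^*-1)(V_0 - P_0)$), the exit probability is the standard image-method series for a Brownian motion with drift between two barriers, truncated to a finite number of terms, and the drift `nu` stands for $\mu - r - \sigma^2/2$. The tolerance band (3, 5) around $m^* = 4$ and the market parameters are hypothetical.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cdf

def corridor(m_star, m_low, m_high):
    """Bounds A < 0 < B of the no-rebalancing corridor (requires m_low > 1)."""
    scale = (m_star - 1.0) / m_star
    A = math.log(m_high / (m_high - 1.0) * scale)
    B = math.log(m_low / (m_low - 1.0) * scale)
    return A, B

def exit_probability(t, A, B, nu, sigma, n_terms=20):
    """P[T_1 <= t]: probability that X leaves (A, B) before t (truncated series)."""
    width = B - A
    s = sigma * math.sqrt(t)
    survival = 0.0
    for n in range(-n_terms, n_terms + 1):
        shift = 2.0 * n * width
        survival += math.exp(nu * shift / sigma ** 2) * (
            N((B - nu * t - shift) / s) - N((A - nu * t - shift) / s)
        )
        survival -= math.exp(nu * (shift + 2.0 * A) / sigma ** 2) * (
            N((B - nu * t - shift - 2.0 * A) / s)
            - N((A - nu * t - shift - 2.0 * A) / s)
        )
    return 1.0 - survival

# Hypothetical inputs
MU, R, SIGMA = 0.10, 0.05, 0.20
A, B = corridor(m_star=4.0, m_low=3.0, m_high=5.0)
nu = MU - R - 0.5 * SIGMA ** 2
horizons = (0.01, 0.05, 0.25, 1.0)
probs = [exit_probability(t, A, B, nu, SIGMA) for t in horizons]
for t, p in zip(horizons, probs):
    print(f"t={t:5.2f}: P[rebalancing before t]={p:.4f}")
```

A narrow tolerance band makes rebalancing almost certain within a year for these inputs, which gives a feel for the trade-off between tracking the target multiple and trading frequency.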


9.2.2 CPPI extensions 


The CPPI method is based on an exposure e which is a simple linear function 
of the cushion. It can be extended by introducing a more general 
exposure function e(t, x), defined on $[0,T] \times \mathbb{R}^+$, which is assumed to be 
positive and continuous and has the following form: 

$$e_t = e(t, C_t). \qquad (9.49)$$
Consequently, the cushion is the solution of the following SDE: 

$$dC_t = \alpha(t, S_t, C_t)dt + \beta(t, S_t, C_t)dW_t + \gamma(t, S_t, C_t)dN_t,$$
with 
$$\alpha(t, S_t, C_t) = rC_t + e(t, C_t)[\mu(t, S_t) - r], \qquad (9.50)$$
$$\beta(t, S_t, C_t) = e(t, C_t)\sigma(t, S_t),$$
$$\gamma(t, S_t, C_t) = e(t, C_t)\delta(t, S_t).$$

The positivity of the cushion is controlled by an appropriate choice of the 
function e(.,.): 


• If the cushion is null then the exposure e(t,0) must be equal to 0. 


• If the relative jumps $\Delta S_t / S_t$ are higher than a fixed constant d (negative), 
then for any (t, x), we must have $e(t,x) \leq -\frac{1}{d}x$. 
REMARK 9.8 The implications of this method can be analyzed by 
considering conditions on buying and selling, the probability distribution of the 
cushion value, etc. Indeed, a more general exposure function allows for a better 
performance since the portfolio manager has more available parameters to 
adjust the portfolio profile. 


Such a choice is compatible with tactical allocation and can be applied with 
fixed income instruments or hedge funds. For example, the multiple m can 
be dynamically chosen to take account of the implied volatility of options traded on 
the financial market or other factors. 


We can also impose a more general guarantee condition 
$$V_T \geq F_T$$
at maturity T, where $F_T$ is a contingent claim such that, for example, 
$$F_T \geq P_0 e^{rT}.$$


Sometimes, the investor wants to keep until maturity some part of the 
portfolio gains. For this purpose, the Time Invariant Portfolio Protection 
(TIPP), introduced by Estep and Kritzman [213], can be used. This method 
ensures that, at any time t: 


$$V_t \geq k\,Max(F_t, \sup_{s \leq t} V_s), \quad \text{with } 0 < k < 1. \qquad (9.51)$$


In particular, it means that the investor does not want to lose more than a 
given percentage of the maximum of previous portfolio values. Denote: 


$$X_t = Max(F_t, \sup_{s \leq t} V_s).$$


In that case, the extended CPPI method is based on the new floor $kX_t$ and 
the new exposure: 
$$e_t = m(V_t - kX_t).$$


We can also consider a more general function of $Max(F_t, \sup_{s \leq t} V_s)$: 
$$e_t = e(t, Max(F_t, \sup_{s \leq t} V_s)).$$
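
A discrete-time sketch of the TIPP rule may help fix ideas. It is an illustration under assumptions that are not in the text: rebalancing occurs at each period, the reference level $F_t$ grows at the riskless rate, the exposure is clipped at zero, and the return path is hypothetical.

```python
def tipp_path(V0, F0, m, k, returns, r):
    """Simulate a TIPP strategy: floor k * max(F_t, running max of V), exposure m * (V - floor)."""
    V, F = V0, F0
    running_max = V0
    values, floors = [V0], []
    for ret in returns:
        X = max(F, running_max)            # X_t = Max(F_t, sup_{s<=t} V_s)
        floor = k * X                      # ratcheting floor k * X_t
        e = max(0.0, m * (V - floor))      # exposure (clipped at zero: assumption)
        V = e * (1.0 + ret) + (V - e) * (1.0 + r)
        F *= 1.0 + r                       # assumed growth of the reference level
        running_max = max(running_max, V)
        values.append(V)
        floors.append(floor)
    return values, floors

# Hypothetical scenario: two gains followed by mostly falling prices
RETURNS = [0.05, 0.04, -0.10, -0.08, 0.03, -0.12]
values, floors = tipp_path(V0=100.0, F0=100.0, m=4.0, k=0.9, returns=RETURNS, r=0.001)
for step, (v, f) in enumerate(zip(values[1:], floors), start=1):
    print(f"step {step}: value={v:8.3f}  floor used={f:8.3f}")
```

In this sketch the floor ratchets up after the two initial gains and then stays locked in, so the later losses are absorbed by a shrinking exposure. The guarantee holds here because every one-period return stays above -1/m.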


As presented in the next chapter, we can weaken the constraint $V_T \geq F_T$ by 
imposing only this condition at a given probability threshold. The insurance 
cost is reduced but the guarantee is no longer sure. 


9.3 Comparison between OBPI and CPPI 


In what follows we refer to Bertrand and Prigent [60]. We assume that 
the same initial amount Vo is invested at time 0, and also that the same 
guarantee K holds at maturity. We suppose also that the risky asset price 
follows a geometric Brownian motion. 


9.3.1 Comparison at maturity 
9.3.1.1 Comparison of the payoff functions 


Is it possible that the payoff function of one of these two strategies lies 
above the other for all $S_T$ values? Since the initial investments are equal 
($V_0^{OBPI} = V_0^{CPPI}$), the absence of arbitrage implies the following result. 


PROPOSITION 9.4 
Neither of the two payoffs is greater than the other for all terminal values of 
the risky asset. Therefore, the two payoff functions intersect one another. 


Figure 9.9 illustrates a numerical example with typical values for the 
financial market parameters: $\mu$ = 10%, $\sigma$ = 20%, T = 1, r = 5%, K = $S_0$ = 100. 
Note that as m increases, the payoff function of the CPPI becomes more 
convex. 


FIGURE 9.9: CPPI and OBPI payoffs as functions of S 

For this example, the two curves intersect one another for the different 
values of m considered (m = 2, m = 4, m = 6, and m = 8). 

CPPI performs better for large fluctuations of the market, while OBPI 
performs better in moderate bullish markets. 








9.3.1.2 Comparison with the stochastic dominance criterion 


The first-order stochastic dominance allows us to take account of the risky 
dimension of the terminal payoff functions for both methods. 


Recall that a random variable X stochastically dominates a random variable 
Y at the first order ($X \succeq Y$) if and only if the cumulative distribution function 
of X, denoted by $F_X$, is always below (or equal to) the cumulative distribution 
function $F_Y$ of Y. 


PROPOSITION 9.5 
Neither of the two strategies stochastically dominates the other at first order. 


9.3.1.3 Comparison of the expectation, variance, skewness, and 
kurtosis 


Since option payoffs are not linear w.r.t. the underlying risky asset, the 
mean-variance criterion is not always justified. Thus, we examine 
simultaneously the first four moments and the semi-variance of the rates of portfolio 
returns $R_T^{OBPI}$ and $R_T^{CPPI}$. 


PROPOSITION 9.6 
The equality of return expectations of both strategies, 

$$E[R_T^{OBPI}] = E[R_T^{CPPI}],$$

leads to a unique value for the multiple, denoted by $m^*(K)$, for any fixed 
guaranteed amount K. In the Black and Scholes framework, this multiple is 
equal to: 

$$m^*(K) = 1 + \frac{1}{(\mu - r)T}\ln\left(\frac{C(0, S_0, K, \mu)}{C(0, S_0, K, r)}\right), \qquad (9.52)$$

where $C(0, S_0, K, x)$ is the Black-Scholes value of the call option, and x 
denotes all possible values of the riskless rate. 
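
Relation (9.52) is straightforward to evaluate. The sketch below is an illustration only: it computes $m^*(K)$ with a hand-written Black-Scholes call formula, using the parameter values of the Figure 9.9 example ($\mu$ = 10%, $\sigma$ = 20%, T = 1, r = 5%, K = $S_0$ = 100). The expected-return line additionally assumes that the OBPI portfolio is one share of S plus one put of strike K; function and variable names are hypothetical.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cdf

def bs_call(S0, K, rate, sigma, T):
    """Black-Scholes call value with the rate passed as a free argument."""
    d1 = (math.log(S0 / K) + (rate + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * N(d1) - K * math.exp(-rate * T) * N(d2)

def matching_multiple(S0, K, mu, r, sigma, T):
    """Multiple m*(K) equating expected OBPI and CPPI returns, relation (9.52)."""
    ratio = bs_call(S0, K, mu, sigma, T) / bs_call(S0, K, r, sigma, T)
    return 1.0 + math.log(ratio) / ((mu - r) * T)

S0, K, MU, R, SIGMA, T = 100.0, 100.0, 0.10, 0.05, 0.20, 1.0
m_star = matching_multiple(S0, K, MU, R, SIGMA, T)

# Assumption: OBPI = one share + one put, so V_0 = S_0 + put and V_T = K + (S_T - K)^+
put = bs_call(S0, K, R, SIGMA, T) - S0 + K * math.exp(-R * T)   # put-call parity
V0 = S0 + put
expected_VT = K + math.exp(MU * T) * bs_call(S0, K, MU, SIGMA, T)
expected_return = expected_VT / V0 - 1.0
insured_percentage = K / V0

print(f"m* = {m_star:.5f}")
print(f"expected return = {expected_return:.5%}, insured percentage p = {insured_percentage:.2%}")
```

The inner call value with rate $\mu$ is used because $E[(S_T - K)^+] = e^{\mu T} C(0, S_0, K, \mu)$ under the historical probability, which is where the ratio in (9.52) comes from.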


PROPOSITION 9.7 

For any parametrization of the financial markets ($S_0, K, \mu, \sigma, r$), there exists 
at least one value for m such that the OBPI strategy dominates the CPPI one 
in a mean-variance sense (respectively, is dominated by it in a 
mean-semivariance sense). 
The following example gives an illustration with the previous values of the 
parameters. 


The multiple m, solution of $E[R_T^{OBPI}] = E[R_T^{CPPI}]$, is equal to 5.77647. 


The next table contains the first four moments and the semi-volatility for 
the OBPI with an at-the-money call, and for the corresponding CPPI with 
this particular value of the multiple. 


TABLE 9.2: Comparison of the first four moments and semi-volatility 

                      OBPI         CPPI 
expectation           8.61176 %    8.61176 % 
volatility            16.8625 %    23.2395 % 
semi-volatility       9.1676 %     7.7666 % 
relative skewness     1.49114      9.70126 
relative kurtosis     5.4576       357.73 


• The OBPI dominates the CPPI in a mean-variance sense, but is 
dominated by the CPPI if semi-volatility is considered. This is confirmed by 
the relative skewness. 


• Nevertheless, the CPPI has a higher positive relative skewness than the 
OBPI, and should be preferred to OBPI for this criterion. 


• However, CPPI relative kurtosis is much higher than the OBPI one. 
This is due to the dominance of the CPPI payoff for small and high 
values of the risky asset S, as shown in Figure 9.9. 


• Note that, here, owing to the insurance feature, kurtosis arises mainly 
in the right tail of the distribution. 


9.3.1.4 Comparison of “quantiles” 


Since the distributions to be compared are strongly asymmetrical, the study 
of the moments is not sufficient. The whole distribution has to be considered. 
The next figure loosely illustrates the situation, where both payoff functions 
and risky asset density are depicted. 
FIGURE 9.10: CPPI and OBPI payoffs and probability of S 


To examine the effect of probabilities, we study the distribution of the 
quotient of the CPPI value to the OBPI one. The plot of the cumulative 
distribution of this quotient is shown in Figure 9.11. 

FIGURE 9.11: Cumulative distribution of $V_T^{CPPI}/V_T^{OBPI}$ 

This figure shows in particular that: 


For the at-the-money call (K = 100 and thus p = 94.72%), the prob- 
ability that the CPPI portfolio value is higher than the OBPI one is 
approximately 0.5, meaning that neither of the strategies “dominates” 
the other. 


This is no longer true for K = 90 (thus p = 87.97%), where the prob- 
ability that the CPPI portfolio value is above the OBPI one is about 
0.4. 


For K = 110 (thus p = 99.39%), this probability takes the value 0.7. 


REMARK 9.9 


This arises because the probability of exercising the call decreases with 
the strike. Recall that the strike K is an increasing function of the 
insured percentage p of the initial investment. Thus, as the guaranteed 
percentage p rises, the CPPI method is more desirable than the OBPI 
method. 


Notice that, for in- and out-of-the-money calls, extreme values of the 
quotient are more likely to appear: 


- On the one hand, the CPPI portfolio value can be at least equal to 
106% of the OBPI portfolio value with probability 5% (respectively 
about 0%) when K = 90 (respectively K = 110). 


- On the other hand, the CPPI portfolio value can be at most equal to 
94% of the OBPI portfolio value with probability about 0% (respectively 
18%) when K = 90 (respectively K = 110). 


The same qualitative results are obtained for other usual values of the 
multiple (m between 2 and 8), which confirms the key role played by 
the insured percentage of the initial investment. 
9.3.2 The dynamic behavior of OBPI and CPPI 


In many situations, the use of traded options is not possible. For example, 
the portfolio to be insured may be a diversified fund for which no single option 
is available. The insurance period may also not coincide with the maturity 
of a listed option. Thus, for all these reasons, the OBPI put often has to 
be synthesized. In this framework, both CPPI and OBPI induce dynamic 
management of the insured portfolio. In what follows: 


• First, it is proven that the OBPI method is a generalized CPPI with a 
variable multiple. The study of this multiple allows quantification of its 
risk exposure. 


• Second, portfolio rebalancing implies hedging risk. Hence, hedging 
properties of both methods are to be analyzed, in particular the behavior of 
the quantity to invest in the risky asset at any time during the 
management period. 

For the CPPI method, the key parameter is the multiple. Indeed, it determines 
the amount invested in the risky asset at any time. Does there exist such an 
“implicit” parameter for the OBPI? 


9.3.2.1 OBPI as a generalized CPPI 


PROPOSITION 9.8 
For the geometric Brownian motion case, the OBPI method is equivalent to 
the CPPI method in which the multiple is allowed to vary and is given by 


$$m^{OBPI}(t, S_t) = \frac{S_t N(d_1(t, S_t))}{C(t, S_t, K)}. \qquad (9.53)$$


This coefficient is the delta of the call, $N(d_1(t, S_t))$, multiplied 
by the risky asset price $S_t$ and divided by the call value. It is equal to the risk 
exposure divided by the cushion value, which is equal to the call value $C(t, S_t, K)$. 
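
The implicit OBPI multiple of relation (9.53) can be tabulated directly. The sketch below is an illustration with hypothetical helper names. It uses the Black-Scholes call formula with the parameter values of the earlier numerical example (r = 5%, $\sigma$ = 20%, T = 1, K = 100) and evaluates the multiple at t = 0.5 on a grid of prices.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard Gaussian cdf

def obpi_multiple(t, S, K, r, sigma, T):
    """Implicit multiple of the OBPI: call delta times S, divided by the call value."""
    tau = T - t                                    # time to maturity
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    call = S * N(d1) - K * math.exp(-r * tau) * N(d2)
    return S * N(d1) / call

K, R, SIGMA, T = 100.0, 0.05, 0.20, 1.0
grid = [80.0 + 5.0 * i for i in range(9)]          # S from 80 to 120
multiples = [obpi_multiple(0.5, S, K, R, SIGMA, T) for S in grid]
for S, m in zip(grid, multiples):
    print(f"S={S:6.1f}: m_OBPI={m:7.3f}")
```

On this grid the multiple is always above 1 and falls as the price rises. It stays above 4 even deep in-the-money, and far exceeds usual CPPI multiples when the call is out-of-the-money.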

In this framework, the OBPI method looks like a CPPI one. 


Note that: 


• The generalized multiple $m^{OBPI}$, associated with the OBPI method, is a 
decreasing function of the risky asset price $S_t$ at any time t. 


• The multiple $m^{OBPI}$ takes higher values than usual CPPI multiples, 
except when the associated call is in-the-money. 


• This implies that, in a bullish market, the OBPI method limits the 
risk exposure more. 


The following two figures illustrate these properties. 


FIGURE 9.12: Multiple OBPI as function of S 


The OBPI multiple takes higher values than the standard CPPI multiple, 
except when the associated call is in-the-money. In particular, in a rising 
market, the OBPI method prevents the portfolio being over-invested in the 
risky asset, as the multiple is low. 
FIGURE 9.13: OBPI multiple cumulative distribution 

We now study the dynamic properties of the two strategies, and in particular 
their “greeks.” 


9.3.2.2 The Delta 


The delta of the OBPI is obviously the delta of the call. For the CPPI, it 
is given by: 
$$\Delta^{CPPI} = \frac{\partial V_t^{CPPI}}{\partial S_t} = \alpha_t m S_t^{m-1}.$$
The following figure shows the evolution of the delta as a function of the 
risky asset value $S_t$. 


FIGURE 9.14: CPPI and OBPI delta as functions of S 


It can be observed in the previous figure that the deltas of the two 
strategies behave differently. For the CPPI, not surprisingly, the delta 
becomes more convex with m and the delta can be greater than one. 


For a large range of the values of the risky asset, the delta of the OBPI 
is greater than that of the CPPI. Moreover, this happens for the most likely 
values of the underlying asset (i.e., around-the-money). 
In order to be more precise, the probability that the delta of the OBPI 
is greater than that of the CPPI has to be calculated for various market 
parametrizations. 

It can be observed that, in probability, CPPI is significantly less sensitive to 
the risky asset than OBPI, as shown in the following tables. Notice that this 
finding has important practical implications. 


TABLE 9.3: Probability $P[\Delta^{OBPI} > \Delta^{CPPI}]$ for different m and $\sigma$ 

m     σ = 5%    σ = 10%   σ = 15%   σ = 20%   σ = 25% 
3     1.000     0.991     0.970     0.945     0.921 
4     1.000     0.987     0.961     0.930     0.876 
5     1.000     0.983     0.946     0.860     0.759 
6     0.999     0.978     0.884     0.748     0.672 
7     0.999     0.949     0.782     0.661     0.636 
8     0.999     0.881     0.685     0.616     0.630 
9     0.992     0.788     0.616     0.599     0.640 
10    0.96      0.69      0.58      0.60      0.66 


TABLE 9.4: Probability $P[\Delta^{OBPI} > \Delta^{CPPI}]$ for different m and $\mu$ 

m     μ = 5%    μ = 10%   μ = 15%   μ = 20%   μ = 25% 
3     0.925     0.930     0.943     0.951     0.953 
5     0.861     0.860     0.850     0.830     0.801 
6     0.774     0.748     0.712     0.667     0.616 
7     0.706     0.661     0.610     0.554     0.495 
8     0.670     0.616     0.558     0.497     0.436 
9     0.657     0.599     0.537     0.475     0.413 
10    0.657     0.598     0.536     0.473     0.411 


The previous features are made clearer by examining the distribution of 
the ratio $\Delta^{OBPI}/\Delta^{CPPI}$. The next figure shows that the probability that the CPPI 
delta is smaller than the OBPI one is a decreasing function of the strike K 
(or equivalently, of the insured amount). Note that, for small values of K, the 
range of possible values of the ratio $\Delta^{OBPI}/\Delta^{CPPI}$ spreads out. 

FIGURE 9.15: Cumulative distribution of $\Delta^{OBPI}/\Delta^{CPPI}$ at t = 0.5 

The following figure shows the evolution of the delta with time. Whatever 
the level of S compared to the insured level at maturity, K, the 
delta of the CPPI is decreasing with time. For the OBPI, the evolution of the 
delta obviously depends on the moneyness of the option. 

FIGURE 9.16: CPPI and OBPI delta as functions of current time. 


More surprisingly, the delta of the CPPI is decreasing with the actual 
volatility (since m > 1). The same feature arises when examining the 
vega of the CPPI, since they depend in the same way on this actual volatility. 


For the OBPI, the result depends on the moneyness of the option. 

9.3.2.3 The Gamma 
The gamma of the CPPI is equal to: 

$$\Gamma^{CPPI} = \frac{\partial^2 V_t^{CPPI}}{\partial S_t^2} = \alpha_t m(m - 1) S_t^{m-2}.$$

FIGURE 9.17: CPPI and OBPI gamma as functions of S at t = 0.5, for 
K = 100 


For the CPPI, the gamma is large only for high values of S. 
Nevertheless, for usual values of m, the CPPI gamma is smaller than 
the OBPI one for a large range of values of S. This is particularly true for 
K = 110. 


This fact is important, as the magnitude of transaction costs is directly 
linked to the gamma. Again, the CPPI method seems to be better suited 
when the insured percentage, p, of the initial investment is high. 


Moreover, the gamma of the CPPI is monotonically decreasing with time, 
although it does not reach zero at maturity. Recall that, for a call, the gamma 
will go to zero as the expiration date approaches if the call is in-the-money or 
out-of-the-money, but will become very large if it is exactly at-the-money. 
FIGURE 9.18: CPPI and OBPI gamma as functions of S at t = 0.5, for 
K = 110 


9.3.2.4 The Vega 
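The vega of the CPPI is defined as¹: 

$$vega^{CPPI} = \frac{\partial V_t^{CPPI}}{\partial \sigma} 
= C(0, S_0, K)\left(\frac{S_t}{S_0}\right)^m ((m - m^2)\sigma t)\exp[\beta t] 
= ((m - m^2)\sigma t)\, C_t^{CPPI},$$

where $C_t^{CPPI}$ denotes the CPPI cushion value at time t. 
Thus, the sensitivity of the CPPI value with respect to the actual volatility 
is negative if m > 1. 
The higher the multiple, the more it decreases. 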
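
The CPPI greeks can be checked against finite differences. The sketch below is an illustration under an assumption: the CPPI value is written as $V_t = P_0 e^{rt} + C_0 (S_t/S_0)^m e^{\beta t}$ with $\beta = r - m(r - \sigma^2/2) - m^2\sigma^2/2$, which is the expression obtained by eliminating $W_t$ between the geometric Brownian motion and the cushion formula (9.24). The numerical inputs and function names are hypothetical.

```python
import math

def cushion(S, t, m, sigma, r, S0, C0):
    """CPPI cushion as a function of the current price (derived from (9.24))."""
    beta = r - m * (r - 0.5 * sigma ** 2) - 0.5 * m ** 2 * sigma ** 2
    return C0 * (S / S0) ** m * math.exp(beta * t)

def cppi_value(S, t, m, sigma, r, S0, C0, P0):
    return P0 * math.exp(r * t) + cushion(S, t, m, sigma, r, S0, C0)

def cppi_greeks(S, t, m, sigma, r, S0, C0):
    """Closed-form delta, gamma and vega of the CPPI value."""
    c = cushion(S, t, m, sigma, r, S0, C0)
    delta = m * c / S
    gamma = m * (m - 1.0) * c / S ** 2
    vega = (m - m ** 2) * sigma * t * c
    return delta, gamma, vega

# Hypothetical inputs
S0, C0, P0, M, SIGMA, R, T_NOW, S = 100.0, 10.45, 95.12, 4.0, 0.20, 0.05, 0.5, 105.0
delta, gamma, vega = cppi_greeks(S, T_NOW, M, SIGMA, R, S0, C0)

def value(S_, sigma_):
    return cppi_value(S_, T_NOW, M, sigma_, R, S0, C0, P0)

# Central finite differences
h_s, h_v = 1e-2, 1e-5
fd_delta = (value(S + h_s, SIGMA) - value(S - h_s, SIGMA)) / (2 * h_s)
fd_gamma = (value(S + h_s, SIGMA) - 2 * value(S, SIGMA) + value(S - h_s, SIGMA)) / h_s ** 2
fd_vega = (value(S, SIGMA + h_v) - value(S, SIGMA - h_v)) / (2 * h_v)
print(f"delta={delta:.5f} (fd {fd_delta:.5f}), gamma={gamma:.6f} (fd {fd_gamma:.6f}), "
      f"vega={vega:.4f} (fd {fd_vega:.4f})")
```

The vega is negative for m > 1 because a higher realized volatility erodes the cushion through the $-m^2\sigma^2/2$ term, while the floor is unaffected.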


REMARK 9.10 To summarize, comparison of OBPI and CPPI, with 
usual criteria such as first order stochastic dominance and various moments 
of their rates of return, does not allow one to discriminate clearly between the 
two strategies: 


- The standard CPPI method is based on dynamic portfolio management 
and thus seems more flexible. 


¹ In the following calculation, we do not take into account the effect of the volatility on 
C(0, S_0, K) because the call enters in the CPPI formula only to ensure the compatibility 
with the OBPI at time 0. Furthermore, C(0, S_0, K) depends only on the expected volatility 
and not on the actual one. 

FIGURE 9.19: CPPI and OBPI vega as functions of S at t = 0.5 


- However, to avoid sudden drops, the multiple must not be too high. In 
such a case, the OBPI strategy seems more robust if the option is well hedged. 
The worst scenario for the CPPI strategy compared to the OBPI one is a 
sudden drop during the management period, then a financial market rise. In that 
case, for the standard method, the cushion remains null and it is no longer 
possible to benefit from risky asset price increases. 


- For small risky asset values, the CPPI method provides higher returns 
but, in that case, a simple riskless investment can beat the CPPI portfolio. 


- As the guaranteed percentage increases, the CPPI strategy is more rele- 
vant than the OBPI one. This arises mainly because the OBPI call has less 
chance to be exercised. 


- The analysis of the dynamic properties of these two methods shows in 
particular how the OBPI method can be considered as a generalized CPPI 
method. They differ mainly by their vegas. 


Note that implied volatilities can be considered to better analyze the 
influence of such a parameter since the call option is priced from implied volatility. 
For the CPPI method, a conditional multiple can be considered, based on 
factors such as the realized or implied volatilities. 


9.4 Further reading 


As mentioned in Leland and Rubinstein (1988), long term returns can 
benefit from portfolio insurance methods since they allow investment in more 
aggressive securities, while satisfying guarantee constraints. 


Bookstaber and Langsam [82] analyze properties of portfolio insurance mod- 
els. They focus on path dependence, showing that only option-replicating 
strategies provide path independence. They deal also with the problem of the 
time horizon and, in particular, time-invariant or perpetual strategies (stud- 
ied also in Black and Perold [77]). 


Black and Rouhani [75] compare CPPI with OBPI when the put option 
has to be synthesized. They compare the two payoffs and examine the role of 
both expected and actual volatilities. They show that “OBPI performs better 
if the market increases moderately. CPPI does better if the market drops or 
increases by a small or large amount.” 


Bertrand and Prigent [58] introduce general marked point processes to mod- 
el stock price variations. Upper bounds are provided in such framework, using 
VaR type criteria. Such an approach can be further extended with other risk 
measures. 


Time varying multiples can be introduced according to market turbulence 
factors, as proposed by Hamidi et al. [280] using quantile regression to esti- 
mate the potential losses, conditionally to these factors. 


Cesari and Cremonini [110] examine various dynamic asset allocations by 
using Monte Carlo simulations. Risk adjusted performance measures, such as 
Sharpe ratio, Sortino ratio, and return to risk are introduced, and strategies 
are compared for different market evolutions (bullish, bearish, and no-trend). 
CPPI strategies seem to perform better in bear and no-trend markets. 
Obviously, benchmarking strategies are better in bullish markets, but such 
strategies provide no true “absolute” insurance against market drops, since they 
“only” must be close to the benchmark (relative performance versus absolute 
performance). 


Chapter 10 


Optimal dynamic portfolio with risk 
limits 


Portfolio insurance payoff is designed to give the investor the ability to limit 
downside risk while allowing some participation in upside markets. Such 
methods allow investors to recover, at maturity, a given percentage of their 
initial capital, in particular in falling markets. 


This payoff is a function of the value at maturity of some specified port- 
folio of common assets, usually called the benchmark. As is well-known by 
practitioners, specific insurance constraints on the horizon wealth must be 
generally satisfied. For example, a minimum level of wealth and some partic- 
ipation in the potential gains of the benchmark can be guaranteed. However, 
institutional investors for instance may require more complicated insurance 
contracts. 


As seen previously in Chapter 9, two standard portfolio insurance methods 
are the Option Based Portfolio Insurance (OBPI) and the Constant Propor- 
tion Portfolio Insurance (CPPI). 


However, to what extent are these methods optimal? 


The literature on portfolio optimality generally considers an investor who 
maximizes the expected utility of his terminal wealth by trading in continuous 
time, as seen in Part 3. The continuous-time setup is also usually introduced 
to study portfolio insurance (see e.g., Grossman and Vila [274], Basak [46], 
or Grossman and Zhou [276]). El Karoui, Jeanblanc and Lacoste [192] prove 
that, under a fixed guarantee at maturity, an option based portfolio strategy 
is optimal for quite general utility functions (see also Jensen and Sorensen 
[302] for a particular case and [415] for a more general guarantee). The key 
assumption is that markets are complete which means that all portfolio pro- 
files at maturity can be perfectly hedged. 


Other literature is devoted to the optimal positioning problem which has 
been addressed in the partial equilibrium context by Brennan and Solanki [88] 
and by Leland [347]. The value of the portfolio is a function of the benchmark 
in a one period set up. An optimal payoff, maximizing the expected utility, 
is derived. It is shown that it depends crucially on the risk aversion of the 
investor. Following this approach, Carr and Madan [106] consider markets in 
which there exist out-of-the-money European puts and calls of all strikes. As they 
mention, this assumption allows the examination of the optimal positioning 
in a complete market and is the counterpart of the assumption of continuous 
trading. This approximation is justified when there is a large number of 
option strikes (e.g., for the S&P500). Due to practical constraints, liquidity, 
transaction costs, etc., portfolios are in fact discretely rebalanced. 


In this chapter, the optimal insured portfolio is determined for the two 
cases: 


e In the first section, the insurance is perfect, since the probability that 
the portfolio value is above the guaranteed level is equal to 1. 


- First, a one-period market is considered. Structured portfolios with 
payoffs defined as functions of the risky asset (a financial stock in- 
dex for example) are examined. Constraints on the horizon wealth 
are included. In addition, markets can be incomplete. The insured 
optimal portfolio is characterized for arbitrary utility functions, re- 
turn distributions, and for any choice of a particular risk neutral 
probability if the market is incomplete. 


- Second, the financial market is assumed to be dynamically complete.
In this framework, results concerning European guarantees are extended
to more general guarantee constraints at maturity. For
example, the guarantee need no longer be a fixed percentage of the
initial investment, but can involve a stochastic component.


• In the second section, the insurance is satisfied at a given probability
level.


- First, the maximization of the probability of success is studied. 


- Second, the expected utility maximization under VaR/CVaR
constraints is examined.


In particular, the optimal portfolio is calculated for CRRA utility functions. 
10.1 Optimal insured portfolio: discrete-time case


10.1.1 Optimal insured portfolio with a fixed number of assets


In this section, we consider an investor who chooses a buy-and-hold strategy 
and can invest in three available financial assets: 


• A riskless asset $B$;
• A risky asset $S$; and,
• A put written on $S$ with strike $K$ and initial value $P_0(K)$.


Note that this portfolio strategy is an extension of the OBPI method, for
which the number of shares of the underlying asset $S$ and the number of puts
written on $S$ are equal.

The investor's strategy consists of setting constant portfolio shares, where:


$\alpha$ is the number of shares invested in $B$;
$\beta$ is the number of shares invested in $S$; and,
$\gamma$ is the number of shares invested in the put with strike $K$.


Denote by $V_T$ the portfolio value:
$$V_T = \alpha B_T + \beta S_T + \gamma (K - S_T)^+. \tag{10.1}$$
The budget constraint is given by:
$$V_0 = \alpha B_0 + \beta S_0 + \gamma P_0(K). \tag{10.2}$$
Assume that the guarantee constraint at maturity is linear:
$$V_T \ge a S_T + b,$$


where a corresponds to a minimal percentage of the potential rise of the risky 
asset, and b is a fixed insured amount. 
Three main cases have to be considered. 


Case 1: $0 < \beta < \gamma$ (the number of purchased puts is higher than the
number of purchased risky assets $S$).


This asset allocation provides the guarantee $\alpha B_T + \beta K$. Note
that the portfolio profile is convex. However, this function is decreasing for
small values of $S$, which is not satisfactory for most investors.

FIGURE 10.1: Optimal portfolio profiles according to instantaneous stock
return (convex but not increasing)


Case 2: $\beta > \gamma > 0$ (the number of purchased puts is smaller than the
number of purchased risky assets $S$).

FIGURE 10.2: Optimal portfolio profiles according to instantaneous stock
return (convex increasing)

In this case, the investor has a guarantee equal to $\alpha B_T + \gamma K$. The
portfolio payoff is still convex.


Case 3: $\beta > 0 > \gamma$ (the investor buys $S$ and sells the put).

FIGURE 10.3: Optimal portfolio profiles according to instantaneous stock
return (concave)


The guarantee is still equal to $\alpha B_T + \gamma K$. However, the portfolio profile
is now concave. The investor will receive more money when the risky asset
decreases, but less money when it increases.
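
The three cases can be checked numerically. The following is a minimal sketch, assuming hypothetical holdings and the payoff (10.1); the function names and the numerical values are illustrative only and do not come from the text.

```python
import numpy as np

def terminal_value(s_T, alpha, beta, gamma, B_T, K):
    """V_T = alpha*B_T + beta*S_T + gamma*(K - S_T)^+, as in (10.1)."""
    s_T = np.asarray(s_T, dtype=float)
    return alpha * B_T + beta * s_T + gamma * np.maximum(K - s_T, 0.0)

def guaranteed_level(alpha, beta, gamma, B_T, K):
    """Minimum of V_T over S_T >= 0, assuming beta >= 0.

    The payoff is piecewise linear with a single kink at S_T = K and a
    non-negative slope beyond K, so its minimum is reached at 0 or at K.
    """
    v_at_0, v_at_K = terminal_value([0.0, K], alpha, beta, gamma, B_T, K)
    return float(min(v_at_0, v_at_K))

# Hypothetical holdings (alpha, beta, gamma), one per case.
B_T, K = 1.0, 100.0
cases = {
    "case 1 (0 < beta < gamma)": (50.0, 1.0, 2.0),    # guarantee alpha*B_T + beta*K
    "case 2 (beta > gamma > 0)": (50.0, 2.0, 1.0),    # guarantee alpha*B_T + gamma*K
    "case 3 (beta > 0 > gamma)": (150.0, 2.0, -1.0),  # guarantee alpha*B_T + gamma*K
}
for name, (alpha, beta, gamma) in cases.items():
    print(name, guaranteed_level(alpha, beta, gamma, B_T, K))
```

In each case the minimum sits at the point predicted above: at $S_T = K$ in Case 1 and at $S_T = 0$ in Cases 2 and 3.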

We now search for the optimal portfolio of an investor who maximizes the expected
value of his utility function $U$, which is assumed to satisfy the usual assumptions.


REMARK 10.1 More generally, the guarantee constraints are infinite-dimensional
since they must be satisfied for all values of $S$ (for example, if the
probability distribution of $S$ is lognormal). However, since both portfolio
payoffs and guarantee constraints are piecewise linear, the guarantee constraints
reduce to only three conditions: the portfolio values must be above the
guarantee function for $S = 0$ and $S = K$, and the slope for high values of $S$
must be higher than the slope of the guarantee function.




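Therefore, the optimization problem is the following:

$$\max_{\alpha,\beta,\gamma}\ E[U(V_T)]$$

subject to the guarantee constraints

$$\alpha B_T + \gamma K \ge b, \qquad \alpha B_T + \beta K \ge aK + b, \qquad \beta \ge a,$$

and the budget constraint

$$V_0 = \alpha B_0 + \beta S_0 + \gamma P_0(K).$$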


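The solution is determined by using the Kuhn and Tucker approach. The
Lagrangian is defined by:

$$L = E[U(V_T)] + \lambda_1(\alpha B_T + \gamma K - b) + \lambda_2(\alpha B_T + \beta K - aK - b) + \lambda_3(\beta - a) + \lambda_4(V_0 - \alpha B_0 - \beta S_0 - \gamma P_0(K)),$$

where $\lambda_4$ is the Lagrange multiplier associated with the budget constraint, and
where the parameters $\lambda_1, \lambda_2, \lambda_3$ are associated with the guarantee constraints. They are all
positive and satisfy the necessary conditions of local optimality given by
(the first condition is written after division by $B_T$, with $B_0/B_T = e^{-rT}$):

$$\frac{\partial L}{\partial \alpha}: \quad E[U'(V_T)] + \lambda_1 + \lambda_2 - \lambda_4 e^{-rT} = 0,$$
$$\frac{\partial L}{\partial \beta}: \quad E[U'(V_T) S_T] + \lambda_2 K + \lambda_3 - \lambda_4 S_0 = 0,$$
$$\frac{\partial L}{\partial \gamma}: \quad E[U'(V_T)(K - S_T)^+] + \lambda_1 K - \lambda_4 P_0(K) = 0.$$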














Therefore, eight cases have to be considered according to the three guar- 
antee constraints which can be (or not) saturated. Three main results are 
observed: 

Strategy 1: The investor is weakly risk-averse. 

Strategy 2: The investor is risk-neutral. 

Strategy 3: The investor is rather strongly risk-averse. 

The weakly risk-averse investor chooses a portfolio which is equal to the
guarantee for values of the risky asset $S$ smaller than the strike $K$. The
risk-averse investor has higher returns for small risky asset values, but smaller
returns for higher values of $S$. According to risk aversion, the portfolio profile
is concave or convex.

Assume for example that the utility function is quadratic:

$$U(x) = x - \frac{a}{2}x^2, \quad \text{with } a > 0.$$

Then, the number of shares can be examined according to the aversion to the
variance $a$: the higher the aversion to the variance $a$, the higher the amount invested
in the riskless asset, and the smaller both the numbers of shares invested in
the risky asset and in the put.

These properties are illustrated by the following figures. 
FIGURE 10.4: Optimal portfolio profile as a function of the risky asset
value


$B_0 = 150$, $S_0 = 2500$, $V_0 = 10000$, $K = 2400$, $r = 4\%$, $\sigma = 20\%$, $\mu = 5\%$.

FIGURE 10.5: Optimal portfolio weighting as a function of the variance
aversion (quadratic case)




10.1.2 Optimal insured payoffs as functions of a benchmark 


In this section, the financial market is assumed to be composed of three basic financial
assets: the cash associated with a discount factor $N$, the bond $B$, and the stock
$S$ (a financial index, for example). We suppose that the investor determines
an optimal payoff $h$ which is a function defined on all possible values of the
assets $(N, B, S)$ at maturity. As in Section 7.1, when there is no insurance
constraint, the optimization problem is solved as follows:

Assume that prices are determined under measure $Q$. Denote by $\frac{dQ}{dP}$ the
Radon-Nikodym derivative of $Q$ with respect to the historical probability $P$.
Denote by $N_T$ the discount factor, and by $M_T$ the product $N_T \frac{dQ}{dP}$.

The investor has to solve the following optimization problem:

$$\max_h\ E_P[U(h(N_T, B_T, S_T))] \quad \text{under} \quad V_0 = E_P[h(N_T, B_T, S_T) M_T]. \tag{10.3}$$

Assume that $h \in L^2(\mathbb{R}^{+3}, P_{X_T}(dx))$, where $X_T = (N_T, B_T, S_T)$; this is the
set of the measurable functions with squares that are integrable on $\mathbb{R}^{+3}$ with
respect to the distribution $P_{X_T}(dx)$.


Now, the investor introduces a specific guarantee, which can be institutional,
or may imply an additional insurance against risk. If, for example, the
interest rate is not stochastic, such a guarantee can be modelled by letting a
function $h_0$ be defined on the possible values of the benchmark $S_T$: whatever
the value of $S_T$, the investor wants to get a final portfolio value above the
floor $h_0(S_T)$. For instance, if $h_0$ is linear with $h_0(s) = as + b$, then, when the
benchmark falls, the investor is sure of getting at least $b$ (equal to a fixed
percentage of his initial investment), and if the benchmark rises, he makes profits
out of the rises at a percentage $a$.

The optimal payoff with insurance constraints on the terminal wealth is the
solution of the following problem:

$$\max_h\ E_P[U(h(X_T))]$$
$$V_0 = E_P[h(X_T) M_T],$$
$$h(X_T) \ge h_0(X_T).$$

As can be seen, the initial investment $V_0$ must be higher than $E_P[h_0(X_T) M_T]$
if the insurance constraint is to be satisfied.

The solution of this problem is given in Prigent [415]. To solve it, introduce
the sets

$$H_1 = \{h \in L^2(\mathbb{R}^{+3}, P_{X_T}) \mid V_0 = E_P[h(X_T) M_T]\},$$

and

$$H_2 = \{h \in L^2(\mathbb{R}^{+3}, P_{X_T}) \mid h \ge h_0\}.$$

The set $H = H_1 \cap H_2$ is a convex set of $L^2(\mathbb{R}^{+3}, P_{X_T})$. Consider the
following indicator function of $H$, denoted by $\delta_H$ and defined by:

$$\delta_H(h) = \begin{cases} 0 & \text{if } h \in H, \\ +\infty & \text{if } h \notin H. \end{cases}$$

Since $H$ is closed and convex, $\delta_H$ is lower semi-continuous and convex.


Recall the notion of subdifferentiability (see, e.g., Ekeland and Turnbull 
[187] for the definition and properties of subdifferentials). 


Let V denote a Banach space and < .,. > the duality symbol. 


DEFINITION 10.1 1) For any function $F$ defined on $V$ with values in
$\mathbb{R} \cup \{+\infty\}$, a continuous affine functional $l: V \to \mathbb{R}$ everywhere less than $F$,
which means that

$$\forall v \in V,\ l(v) \le F(v),$$

is exact at $v^*$ if $l(v^*) = F(v^*)$. (10.4)


2) A function $F: V \to \mathbb{R} \cup \{+\infty\}$ is subdifferentiable at $v^*$ if there exists a
continuous affine functional $l(\cdot) = \langle \cdot, v_1^* \rangle - a$, everywhere less than $F$, which
is exact at $v^*$. The slope $v_1^*$ of such an $l$ is a subgradient of $F$ at $v^*$. The set
of all subgradients of $F$ at $v^*$ is the subdifferential of $F$ at $v^*$ and is denoted
by $\partial F(v^*)$.


Recall the following characterization:

$$v_1^* \in \partial F(v^*) \text{ iff } F(v^*) < +\infty \text{ and } \forall v \in V,\ \langle v - v^*, v_1^* \rangle + F(v^*) \le F(v). \tag{10.5}$$

Denote by $\partial \delta_H$ the subdifferential of $\delta_H$. The optimization problem is
equivalent to:

$$\max_h\ \left( E[U(h(X_T))] - \delta_H(h) \right). \tag{10.6}$$











The optimality conditions lead to:


PROPOSITION 10.1
There exists a scalar $\lambda_c$ and a function $h_c$ in $L^2(\mathbb{R}^{+3}, P_{X_T})$ such that:

$$h^* = J(\lambda_c g + h_c), \tag{10.7}$$

where $\lambda_c$ is the solution of:

$$V_0 = \int_0^{\infty} J[\lambda_c g(x) + h_c(x)]\, g(x) f(x)\, dx, \tag{10.8}$$

and $h_c \in \partial \delta_{H_2}(h^*)$.


To explain more precisely the condition $h_c \in \partial \delta_{H_2}(h^*)$, assume that the
functions $h_0$ and $h_c$ are continuous (such properties are always verified in
practice). Consequently, the optimal payoff $h^*$ is continuous.
COROLLARY 10.1

Under the above assumption, the function $h_c$ satisfies the following property:
1) If, on a product $I$ of intervals of values of $X_T$, $h^*(X_T) > h_0(X_T)$, then $h_c$
is equal to 0 on $I$.

2) If, on a product $I$ of intervals of values of $X_T$, $h^*(X_T) = h_0(X_T)$, then $h_c$
is negative on $I$.


REMARK 10.2 From the previous results, the optimal payoff $h^*$ can
be determined by introducing the unconstrained optimal payoff $h^e$ associated
with the modified coefficient $\lambda_c$ (i.e., $h^e = J(\lambda_c g)$). $\lambda_c$ can also be considered
as a Lagrange multiplier associated with a non-insured optimal portfolio but
with a modified initial wealth. Indeed, when $h^e$ is greater than the insurance
floor $h_0$, then $h^* = h^e$. Otherwise, $h^* = h_0$. However, the payoff is usually a
continuous function of the values of the benchmark, like any linear combination
of standard options. In that case, the optimal payoff is given by:

$$h^* = \max(h_0, h^e) = h_0 + \max(h^e - h_0, 0). \tag{10.9}$$

Consequently, the optimal solution is the sum of the constraint $h_0$ and a
call of "strike" $h_0$ and underlying $h^e$, which is the optimal solution without
constraint. This result is true even in the non-complete case, and with a
general guarantee constraint. In the next section ("dynamic" case), the same
kind of result is proved (see Proposition 10.3).


Since the general result in the previous proposition is an extension of the 
Kuhn-Tucker theorem to infinite dimension, it is well-known that the determi- 
nation of h* implies the comparison of all possible solutions of the kind (10.9). 
However, this problem can easily be solved if the payoff must be continuous. 


COROLLARY 10.2

Under the previous assumptions on the utility $U$, there is one and only one
continuous optimal payoff, associated with the unique solution $\lambda_c$ of the budget
equation.


PROOF From the assumptions on the marginal utility $U'$, we deduce that
its inverse $J$ is a continuous and decreasing function with:

$$\lim_{0^+} J = +\infty \quad \text{and} \quad \lim_{+\infty} J = 0.$$

Thus, for all $x$, the function $\lambda_c \mapsto h^*(\lambda_c, x) = \max(h_0(x), h^e(\lambda_c, x))$ is
continuous and decreasing. Therefore, the function $\lambda_c \mapsto E_Q[h^*(\lambda_c, X_T)]$ is
continuous and decreasing from $+\infty$ to $E_Q[h_0(X_T)]$, which is lower than
the initial investment $V_0$. From the intermediate value theorem and by
monotonicity, the result is deduced.
REMARK 10.3 Finally, an investment strategy is associated with the
optimal payoff; it can be computed by the following approach. Suppose, for
example, that the interest rate is non-stochastic. Then:


• First, the function $h^*$ is approximated by a sequence of twice differentiable
payoff functions $h_n$.


• Second, as proved in Carr and Madan (1997), it is possible to explicitly
identify the position that must be taken in order to achieve a given
payoff $h_n$ that is twice differentiable.


$h_n$ is duplicated by a unique initial position of $h_n(S_0) - h_n'(S_0) S_0$ unit discount
bonds, $h_n'(S_0)$ shares and $h_n''(K)\,dK$ out-of-the-money options of all
strikes $K$:

$$h_n(S) = [h_n(S_0) - h_n'(S_0) S_0] + h_n'(S_0)\, S + \int_0^{S_0} h_n''(K)(K - S)^+\, dK + \int_{S_0}^{\infty} h_n''(K)(S - K)^+\, dK.$$

Generally, as mentioned previously, $h_0$ is increasing and so is $h^e$. Therefore,
the optimal payoff is an increasing function of the benchmark.
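
The replication formula of Carr and Madan can be verified numerically on a simple payoff. The sketch below is only an illustration under stated assumptions: the payoff $h(s) = s^2/100$, the strike grid, the truncation level `K_max` and the helper names are all hypothetical choices, and the integrals over strikes are approximated by the trapezoidal rule.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal rule on a grid x (avoids depending on a specific NumPy helper)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def replicate(h, dh, d2h, S0, S, K_max, n=4001):
    """Value at maturity, for a terminal price S, of the static position:
    bonds + shares + puts with strikes in [0, S0] + calls with strikes in [S0, K_max]."""
    bonds = h(S0) - dh(S0) * S0
    shares = dh(S0) * S
    K_put = np.linspace(0.0, S0, n)
    K_call = np.linspace(S0, K_max, n)   # K_max must exceed S (truncation of +infinity)
    puts = _trapezoid(d2h(K_put) * np.maximum(K_put - S, 0.0), K_put)
    calls = _trapezoid(d2h(K_call) * np.maximum(S - K_call, 0.0), K_call)
    return bonds + shares + puts + calls

# Hypothetical twice-differentiable payoff and its derivatives.
h = lambda s: s**2 / 100.0
dh = lambda s: 2.0 * s / 100.0
d2h = lambda k: np.full_like(k, 0.02)

for S in (80.0, 130.0):
    print(S, h(S), replicate(h, dh, d2h, S0=100.0, S=S, K_max=500.0))
```

For both terminal prices, the replicated value agrees with $h(S)$ up to the quadrature error, the puts being active below $S_0$ and the calls above.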


REMARK 10.4 From the previous theoretical result, we conclude that: 

- Generally, an optimal portfolio must include options in order to maximize 
the expected utility of investors. 

- The solution is a combination of the optimal portfolio value without guar- 
antee, and a put written on it with a strike equal to the floor. Under the stan- 
dard assumptions that the insurance constraints and the payoff are modelled 
by continuous functions of the risky asset, the solution is also the maximum 
between this function and the solution of the unconstrained problem but with 
a different initial wealth. 

- In the no guarantee case, the concavity/convexity of the portfolio profile 
is determined from the degree of risk aversion and from the financial market 
performance, for example a Sharpe-type ratio. This kind of result still holds 
according to the insurance constraint at maturity.


All the above properties are illustrated in the next example. 
10.1.2.0.1 A special case
Assume that the utility function of the investor is a CRRA utility:

$$U(x) = \frac{x^{\alpha}}{\alpha},$$

with $0 < \alpha < 1$, from which we deduce $J(x) = x^{\frac{1}{\alpha - 1}}$.


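Suppose that the interest rate $r$ is constant, that the stock price evolves in
a continuous-time setup, and in particular that $(S_t)_t$ is a geometric Brownian
motion given by:

$$S_t = S_0 \exp\left[(\mu - \tfrac{1}{2}\sigma^2)t + \sigma W_t\right].$$

Denote by $f$ the density of $S_T$.


Notations:

$$\theta = \frac{\mu - r}{\sigma}, \quad A = -\frac{1}{2}\theta^2 T + \frac{\theta}{\sigma}\left(\mu - \frac{1}{2}\sigma^2\right)T, \quad \psi = e^{A} (S_0)^{\frac{\theta}{\sigma}}, \quad \kappa = \frac{\theta}{\sigma}.$$

Recall that in the Black and Scholes model, the conditional expectation $g$
of $\frac{dQ}{dP}$ under the $\sigma$-algebra generated by $S_T$ is given by:

$$g(s) = \psi s^{-\kappa}.$$

Therefore, $h^e(s)$ satisfies:

$$h^e(s) = d \times s^m \quad \text{with} \quad d = (\lambda \psi)^{\frac{1}{\alpha - 1}} \quad \text{and} \quad m = \frac{\kappa}{1 - \alpha} > 0, \tag{10.10}$$

where $\lambda$ denotes the Lagrange multiplier of the budget constraint.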


We apply the previous general results to solve the optimization problem.
Then, if there is no insurance constraint, the optimal payoff is given by:

$$h^e(s) = \frac{V_0 e^{rT}}{\int_0^{\infty} g(s)^{\frac{\alpha}{\alpha - 1}} f(s)\, ds} \times g(s)^{\frac{1}{\alpha - 1}}. \tag{10.11}$$

If the insurance constraint is required, then the optimal payoff must be the
solution of

$$\max_h\ E\left[\frac{h(S_T)^{\alpha}}{\alpha}\right]$$
$$V_0 = e^{-rT} E_Q[h(S_T)],$$
$$h(S_T) \ge h_0(S_T).$$

Then:


PROPOSITION 10.2
The optimal payoff with guarantee is given by:

$$h^* = J(\lambda_c g + h_c), \tag{10.12}$$

where $\lambda_c$ is the solution of:

$$V_0 e^{rT} = \int_0^{\infty} J[\lambda_c g(s) + h_c(s)]\, g(s) f(s)\, ds, \tag{10.13}$$

and $h_c$ is a negative function satisfying the property of the previous corollary.


COROLLARY 10.3
Assume as usual that $h_0$ is increasing and continuous. Recall that we have
$h^* = \max(h^e, h_0)$ and that the solution $h^e$ of the unconstrained problem associated
with the Lagrange multiplier $\lambda_c$ is increasing. Then, the optimal payoff is an
increasing continuous function of the benchmark at maturity.


REMARK 10.5 As seen in Chapter 7, if there is no insurance constraint,
the concavity/convexity of the optimal payoff is determined by the comparison
between the risk aversion and the ratio $\kappa = \frac{\mu - r}{\sigma^2}$, which is the Sharpe
ratio divided by the volatility $\sigma$.

i) $h^e$ is concave if $\kappa < 1 - \alpha$.
ii) $h^e$ is linear if $\kappa = 1 - \alpha$.
iii) $h^e$ is convex if $\kappa > 1 - \alpha$.


The graph of the optimal payoff changes from convexity to concavity as the
relative risk aversion $1 - \alpha$ of the investor increases. If, for example, the
insurance constraint is linear ($h_0(s) = as + b$), it looks like the unconstrained
case, except when $h^*$ is equal to the constraint $h_0$.
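
The comparison between $\kappa$ and $1 - \alpha$ is easy to automate. The following sketch (the function name `payoff_shape` is a hypothetical helper) computes the exponent $m = \kappa/(1-\alpha)$ of the unconstrained power payoff and classifies its shape.

```python
import math

def payoff_shape(mu, r, sigma, alpha):
    """Return (m, shape) for the unconstrained CRRA payoff h^e(s) = d * s**m.

    kappa = (mu - r) / sigma**2 and m = kappa / (1 - alpha), with alpha < 1.
    The payoff is concave if m < 1, linear if m = 1 and convex if m > 1.
    """
    kappa = (mu - r) / sigma**2
    m = kappa / (1.0 - alpha)
    if math.isclose(m, 1.0):
        shape = "linear"
    elif m < 1.0:
        shape = "concave"
    else:
        shape = "convex"
    return m, shape

print(payoff_shape(0.10, 0.03, 0.20, 0.7))   # weakly risk-averse investor
print(payoff_shape(0.04, 0.03, 0.20, 0.5))   # kappa = 0.25 < 1 - alpha = 0.5
```

With $\mu = 0.1$, $r = 3\%$, $\sigma = 0.2$ and $\alpha = 0.7$, this gives $\kappa = 1.75$ and $m \approx 5.83$, hence a convex profile.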


The previous theoretical result justifies the introduction of power options 
in the portfolio (see Chapter 9). 


As seen in Section 9.2, if $\kappa < 1 - \alpha$, the optimal payoff $h^{**}$ is concave, and
if $\kappa > 1 - \alpha$, it is convex.


As shown previously for the buy-and-hold case (one bond, one stock, and
only a finite number of options written on it), if the guaranteed payoff is
linear, the optimal (polygonal) payoff is still concave/convex according to the
degree of risk aversion.
Example 10.1
Consider the same parameter values as in Example 9.2.

Then, the optimal payoff profile can be examined according to values of
parameters $d$ and $m$. Assuming that

$$\mu = 0.1,\ \sigma = 0.2,\ V_0 = 100,\ S_0 = 100,\ r = 3\%,\ T = 5,\ a = 0.7,\ b = 90,\ \alpha = 0.7,$$

the optimal payoff profile is equal to:

$$h^e(s) = d \times s^m$$

with $d = 8.7253 \times 10^{-11}$ and $m = 5.8333$.


The terminal portfolio value is given by:
$$V_T^* = 0.7\, S_T + 90 + \max\left(8.7253 \times 10^{-11}\, S_T^{5.8333} - 0.7\, S_T - 90;\ 0\right).$$
Note for example that: 


a = 0.1 = d = 0.01 and m = 0.84, 
a = 0.9 > d = 1.55.1073% and m = 17.5. 
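
The coefficient $d$ of an insured power payoff $\max(d\,s^m, as + b)$ can be obtained numerically from the budget equation, since its price is continuous and monotone in $d$. The sketch below is a hedged illustration, not the computation used in the example: the floor parameters $a = 0.3$, $b = 60$ are hypothetical (chosen so that the floor is affordable with $V_0 = 100$), the risk-neutral expectation is approximated by Gauss-Hermite quadrature, and the root is found by bisection on a logarithmic scale.

```python
import numpy as np

# Hypothetical inputs; the floor (a, b) is an assumption, not taken from the text.
V0, S0, r, T, sigma = 100.0, 100.0, 0.03, 5.0, 0.2
mu, alpha = 0.10, 0.7
a, b = 0.3, 60.0

m = (mu - r) / sigma**2 / (1.0 - alpha)   # exponent kappa / (1 - alpha)

# Gauss-Hermite nodes and weights for expectations of functions of Z ~ N(0, 1).
z, w = np.polynomial.hermite_e.hermegauss(100)
w = w / w.sum()
# Terminal stock price under the risk-neutral probability Q.
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)

def price(d):
    """Approximate time-0 price of the insured payoff max(d * S_T**m, a * S_T + b)."""
    payoff = np.maximum(d * S_T**m, a * S_T + b)
    return float(np.exp(-r * T) * np.dot(w, payoff))

def solve_d(lo=1e-30, hi=1.0, iters=200):
    """Bisection on log(d): price(d) is continuous and non-decreasing in d."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if price(mid) < V0:
            lo = mid
        else:
            hi = mid
    return float(np.sqrt(lo * hi))

d = solve_d()
print("m =", m, "d =", d, "price =", price(d))
```

The bracket `[lo, hi]` must contain the root: the price at `lo` is essentially the price of the floor alone, which must be below $V_0$ for the problem to be feasible.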


REMARK 10.6 When the number of available options is finite, the indirect
expected utility is smaller than the previous one, where an infinite number
of options can be used to replicate the portfolio value $h^{**}(s)$.


This static replication is based on the following result (see, e.g., Carr and
Madan [106]):


Any payoff $p(s)$ can be replicated by a position which is composed of
• $[p(S_0) - (\partial p/\partial s)(S_0)\, S_0]$ shares of riskless asset;
• $[(\partial p/\partial s)(S_0)]$ shares of risky asset $S$; and,
• $[(\partial^2 p/\partial s^2)(K)]$ shares of out-of-the-money options for all strikes $K$.


However, for a given finite set of available options, we can consider either a
combination $p(s)$ of these options which minimizes a given loss function w.r.t.
the optimal payoff $h^{**}(s)$, or the optimal solution for the given utility function
with the finite set of available options.
10.2 Optimal Insured Portfolio: the dynamically complete case


10.2.1 Guarantee at maturity 


Assume that the financial market is complete, arbitrage free, and frictionless.
Asset prices are supposed to follow continuous-time diffusion processes.
According to the investor's risk aversion and horizon, the portfolio manager
chooses the proportions to invest in financial assets, among them all zero-coupon
bonds for maturities $T$ defined on an instantaneous interest rate $(r_t)_t$.
The resulting portfolio value $(V_t)_t$ is self-financing. This means that the
process $\left(V_t \exp(-\int_0^t r_s\, ds)\right)_t$ is a $Q$-martingale, where $Q$ is the risk-neutral
probability.


Denote by $\eta = \frac{dQ}{dP}$ the Radon-Nikodym derivative of $Q$ with respect to the
historical probability $P$. Denote also by $M_T$ the process $\eta_T \exp(-\int_0^T r_s\, ds)$.
Due to the no-arbitrage condition, the budget constraint corresponds to the
following relation:

$$V_0 = E_Q\left[V_T \exp\left(-\int_0^T r_s\, ds\right)\right] = E_P[V_T M_T].$$


Assume that the investor wants to maximize an expected utility under the
statistical probability $P$. As usual, the utility $U$ of the investor is supposed
to be increasing, concave, and twice-differentiable. Suppose also that the
marginal utility $U'$ satisfies:

$$\lim_{0^+} U' = +\infty \quad \text{and} \quad \lim_{+\infty} U' = 0.$$

Denote by $J$ the inverse of the marginal utility $U'$.


The guarantee constraint requires the portfolio value $V_T$ at maturity to stay
above a floor $F_T$. This floor may be deterministic, corresponding, for
example, to a predetermined percentage $p$ of the initial investment $V_0$, or may
be stochastic if, for instance, the investor wants to benefit from potential
market rises. For example, this floor may be equal to

$$F_T = a S_T + b,$$

where $a$ is a given percentage of the benchmark $S$ (a stock index, for instance)
and $b$ is a fixed guaranteed amount which usually corresponds to a fixed
percentage of the initial investment. In all cases, it is assumed that there
exists a portfolio that duplicates the floor $F_T$.

Then, for a given initial investment $V_0$, the investor wants to find the
portfolio $\theta$ that solves the following optimization problem:

$$\max_{\theta}\ E_P[U(V_T)] \quad \text{under} \quad V_T \ge F_T.$$
Due to market completeness, this problem is equivalent to (see Cox-Huang
[132]):

$$\max_{V_T}\ E_P[U(V_T)] \tag{10.14}$$
$$\text{under } V_T \ge F_T \text{ and } V_0 = E_P[V_T M_T] \ge E_P[F_T M_T].$$





























PROPOSITION 10.3

The optimal solution $V_T^*$ of problem (10.14) is given by the maximum of the
floor $F_T$ and the solution $V_T^e$ of the unconstrained problem for an initial
investment $V_0^e$ such that $V_0 = E_P[\max(V_T^e, F_T) M_T]$. Equivalently, this solution
can be viewed as a combination of the portfolio value $V_T^e$ and a put written
on it with "strike" equal to the floor, or a combination of the floor and a call
written on the portfolio value $V_T^e$:

$$V_T^* = V_T^e + (F_T - V_T^e)^+ = F_T + (V_T^e - F_T)^+.$$


PROOF Consider the solution $V_T^e$ of the free problem (without guarantee
constraint). Using Cox and Huang [132] results, this solution is given by:

$$V_T^e = J(a M_T),$$

where the Lagrangian parameter $a$ is such that $V_0^e = E_P[V_T^e M_T]$.
Furthermore, for any portfolio $V_T$ with initial investment $V_0$ satisfying $V_T \ge F_T$,
since the utility $U$ is concave, we have:
$$U(V_T) - U(V_T^*) \le U'(V_T^*)(V_T - V_T^*),$$
and since $U'$ is decreasing, we deduce:
$$U'(V_T^*)(V_T - V_T^*) = \min(a M_T, U'(F_T))(V_T - V_T^*).$$
Additionally,

$$\min(a M_T, U'(F_T))(V_T - V_T^*) = a M_T (V_T - V_T^*) - [a M_T - U'(F_T)]^+ (V_T - F_T).$$

Finally, since $E_P[V_T M_T] = V_0 = E_P[V_T^* M_T]$, we get:

$$E_P[\min(a M_T, U'(F_T))(V_T - V_T^*)] = -E_P\left[[a M_T - U'(F_T)]^+ (V_T - F_T)\right] \le 0.$$

Therefore:

$$E_P[U(V_T)] \le E_P[U(V_T^*)].$$
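
The two identities used in the proof can be checked case by case. The following lines are a short worked verification, written under the assumptions of the proposition ($V_T^* = \max(J(aM_T), F_T)$ and $J = (U')^{-1}$ decreasing); they add no new result.

```latex
% Case-by-case check of U'(V_T^*) = \min(aM_T, U'(F_T)) and of the decomposition.
\begin{align*}
\text{If } aM_T \le U'(F_T):\quad
  & J(aM_T) \ge F_T,\quad V_T^* = J(aM_T),\quad U'(V_T^*) = aM_T,\\
  & [aM_T - U'(F_T)]^+ = 0,\ \text{so both sides equal } aM_T\,(V_T - V_T^*).\\[4pt]
\text{If } aM_T > U'(F_T):\quad
  & J(aM_T) < F_T,\quad V_T^* = F_T,\quad U'(V_T^*) = U'(F_T),\\
  & aM_T\,(V_T - F_T) - [aM_T - U'(F_T)]\,(V_T - F_T) = U'(F_T)\,(V_T - F_T).
\end{align*}
```

In the second case the term $[aM_T - U'(F_T)]^+ (V_T - F_T)$ is non-negative because $V_T \ge F_T$, which gives the final inequality after taking expectations.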
10.2.2 Risk exposure and utility function


Call-power options and CPPI strategies can be chosen according to optimality
criteria, as seen in the next examples. In what follows, we assume that
the interest rate $r$ is constant and that $(S_t)_t$ is a geometric Brownian motion
given by:

$$S_t = S_0 \exp\left[(\mu - \tfrac{1}{2}\sigma^2)t + \sigma W_t\right].$$

As done previously, we denote:

$$\theta = \frac{\mu - r}{\sigma}, \quad A = -\frac{1}{2}\theta^2 T + \frac{\theta}{\sigma}\left(\mu - \frac{1}{2}\sigma^2\right)T, \quad \psi = e^{A} (S_0)^{\frac{\theta}{\sigma}}, \quad \kappa = \frac{\theta}{\sigma}.$$

Recall that in the Black and Scholes model, the conditional expectation $g$
of $\frac{dQ}{dP}$ under the $\sigma$-algebra generated by $S_T$ is given by:

$$g(s) = \psi s^{-\kappa}.$$


Example 10.2
Assume that the investor has a HARA utility $U$ given by:

$$U(x) = \frac{(x - K)^{\alpha}}{\alpha}.$$

Then, the portfolio value $V_T$ which maximizes the expected utility is given
by: there exists a non-negative constant $\zeta$ such that

$$V_T(S_T) = K + \zeta\, S_T^{\frac{\kappa}{1 - \alpha}}.$$

Thus, it corresponds to a CPPI portfolio value with guarantee $K$ and a
multiple equal to $\frac{\kappa}{1 - \alpha}$.





Example 10.3
Assume that the investor has a CRRA utility $U$ given by:

$$U(x) = \frac{x^{\alpha}}{\alpha}.$$

Then, the portfolio value $V_T$ which maximizes the expected utility with the
additional guarantee constraint $K$ at maturity is associated with a non-negative
constant $\xi$ such that:

$$V_T(S_T) = K + \left(\xi\, S_T^{\frac{\kappa}{1 - \alpha}} - K\right)^+.$$

Thus, it corresponds to a call-power portfolio value with guarantee $K$ and a
power equal to $\frac{\kappa}{1 - \alpha}$.
Example 10.4
Assume that the investor has a CRRA utility $U$ given by:

$$U(x) = \frac{x^{\alpha}}{\alpha}.$$

Then, the portfolio value $V_T$ which maximizes the expected utility defined on
the cushion $V_T - K$ is given by: there exists a non-negative constant $\chi$ such
that

$$V_T(S_T) = K + \chi\, S_T^{\frac{\kappa}{1 - \alpha}}.$$

Thus, it corresponds also to a CPPI portfolio value with guarantee $K$ and a
multiple equal to $\frac{\kappa}{1 - \alpha}$.
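
The link between a CPPI strategy and a power payoff can be illustrated by simulation. The sketch below assumes a geometric Brownian motion for $S$, a floor $K e^{-r(T-t)}$ and a cushion rebalanced at each step with a constant multiple $m$; it compares the simulated terminal cushion with the standard continuous-time expression $C_T = C_0 (S_T/S_0)^m \exp\{[r - m(r - \sigma^2/2) - m^2\sigma^2/2]T\}$. All parameter values and function names are hypothetical.

```python
import numpy as np

def cppi_cushion(C0, S0, mu, r, sigma, m, T, n_steps, n_paths, seed=0):
    """Simulate the terminal cushion of a CPPI rebalanced n_steps times.

    At each step an amount m*C is invested in the stock and (1 - m)*C in the
    riskless asset, where C is the current cushion (value minus floor).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    Z = rng.standard_normal((n_paths, n_steps))
    log_ret = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
    gross = np.exp(log_ret)                                   # S_{t+dt} / S_t
    step = 1.0 + m * (gross - 1.0) + (1.0 - m) * (np.exp(r * dt) - 1.0)
    C_T = C0 * np.prod(step, axis=1)
    S_T = S0 * np.exp(log_ret.sum(axis=1))
    return C_T, S_T

def cushion_closed_form(C0, S0, S_T, r, sigma, m, T):
    """Continuous-time CPPI cushion as a power function of S_T."""
    beta = r - m * (r - 0.5 * sigma**2) - 0.5 * m**2 * sigma**2
    return C0 * (S_T / S0) ** m * np.exp(beta * T)

C_T, S_T = cppi_cushion(C0=20.0, S0=100.0, mu=0.08, r=0.03, sigma=0.2,
                        m=3.0, T=1.0, n_steps=20000, n_paths=5)
print(C_T)
print(cushion_closed_form(20.0, 100.0, S_T, 0.03, 0.2, 3.0, 1.0))
```

With a fine rebalancing grid the two rows agree closely, path by path, which illustrates why the optimal payoffs $K + \zeta S_T^{m}$ of these examples can be read as CPPI values with multiple $m$.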


REMARK 10.7 

- For a given profile $h(S_T)$, under some mild assumptions, a utility function
$U$ can be identified (up to a linear transformation) such that the solution of
the expected utility maximization is the profile $h(S_T)$.


Assume for instance that $h$ is defined and invertible on a given interval of
$\mathbb{R}$. Then, the condition: there exists a non-negative scalar $\lambda$ such that

$$U'(x) = \lambda g(h^{-1}(x))$$

ensures that $h(S_T)$ is the optimal profile for the expected $U$ maximization for
some initial amount $V_0$.


- For standard options (generally not invertible), U must be determined for 
each case. For instance, if 


Vr = Vo + (Sr - K)*, 
then we can consider the utility function: 


a-Vt+kK 


U(2) = (-00)lecvo + “2 


g(a a Vo T K)le>v- 


Therefore, the choice of a particular insured portfolio can determine an 
implicit utility function. 
10.2.3 Optimal portfolio with controlled drawdowns


As in Grossman and Zhou [275], consider an investor who wants at any time
to lose no more than a fixed percentage of the maximum value his portfolio
has achieved up to that time.


Denote by $M_t$ the maximum value of the investor's wealth $V$ invested in
the portfolio, on or before time $t$. Thus the constraint on the value process
$V$ is as follows: there exists a constant $\lambda$ in $[0,1]$ such that at any time $t$ in
$[0, T]$,

$$V_t \ge \lambda M_t.$$

- The financial market contains a riskless asset with rate $r$. The price of
the risky asset $S$ is the solution of the following SDE w.r.t. the probability
space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$:

$$dS_t = S_t\,(\mu\, dt + \sigma\, dW_t),$$

where $W$ is a standard Brownian motion, and the usual assumptions on
coefficients $\mu$ and $\sigma$ are made. Denote $\bar{\mu} = \mu - r$. This financial market is
complete.


- The portfolio value $V$ is the solution of:
$$dV_t = V_t\left[r\, dt + w_S\,(\bar{\mu}\, dt + \sigma\, dW_t)\right],$$

where $w_S$ is the portfolio weight invested in the stock $S$, which is assumed to
be predictable w.r.t. the filtration $(\mathcal{F}_t)_t$ and such that the wealth process $V$
is always positive.

- The drawdown control: Denote by $M_0$ a positive amount invested at time
0 which evolves at a growth rate $\delta$ with $\delta \le r$. Define:

$$M_t = \max\left(M_0 e^{\delta t},\ V_s e^{\delta(t - s)};\ s \le t\right). \tag{10.16}$$

Note that, when $\delta = 0$, $M_t$ denotes the highest value between the initial
value $M_0$ and all the portfolio values on or before time $t$. If $\delta \to -\infty$, $M_t$ goes to 0.


The portfolio value must always be above the stochastic floor $\lambda M$:
$$\forall t \in [0, T],\ V_t \ge \lambda M_t, \text{ a.s.} \tag{10.17}$$

The term $\left(1 - \frac{V_t}{M_t}\right)$ is called the drawdown. Therefore, the drawdown
control condition is: for a given level $\lambda$,

$$\left(1 - \frac{V_t}{M_t}\right) \le 1 - \lambda. \tag{10.18}$$


- The utility function: Assume that the investor has a power utility function
$U(x) = \frac{x^{\alpha}}{\alpha}$ with $\alpha < 1$ and $\alpha \neq 0$. Suppose that the objective is to maximize
the long-term growth rate of the expected utility. Its upper value is given by:

$$\zeta^* = \sup_{w_S \in \mathcal{A}}\ \liminf_{T \to \infty}\ \frac{1}{\alpha T} \ln\left(E[\alpha U(V_T)]\right), \tag{10.19}$$

where $\mathcal{A}$ is the set of weights $w_S$ such that condition (10.17) is satisfied.


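Recall that, when $\lambda = 0$, the optimal investment strategy and the associated
growth rate are given by:

$$w_S^* = \frac{1}{1 - \alpha}\,\frac{\bar{\mu}}{\sigma^2}, \qquad \zeta^* = r + \frac{1}{2(1 - \alpha)}\,\frac{\bar{\mu}^2}{\sigma^2}.$$

Note that when $\lambda > 0$, this strategy violates condition (10.17) with
probability one.
Denote $\tilde{M}_t = e^{-\delta t} M_t$ and $\tilde{V}_t = e^{-\delta t} V_t$.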


PROPOSITION 10.4 Grossman and Zhou [275] 
If there exists a constant € and a function J (V, M) such that: 
i) T(V, M) is solution of the Bellman equation: 














I(V,M), = sup t [aI (V, MU Vr)e- 
wsEA 


ii) There exists a trading strategy w% which achieves the supremum; and, 
iii) There exist positive constants Cı and C2 such that: 


CiaU(V) < aJ (V, M) < Coat (V), 


then, the maximum long-term growth rate of the expected utility of final wealth 
is achieved by the strategy wg. In addition, € is also the rate of the finite- 
horizon problem: 














1 
e= lim inf zr” (e am u )]]) ; (10.20) 


Moreover, consider TV, M) defined by: 














7(V,M) = sup lim inf z [U(Vr)e~°87) . 
wseA T—0o 
Then, if T(V, M) is finite for V > AM, we have: 
- The e functional TY, M) is homogeneous of degree a in V and M. 
- FV, M) is an increasing and concave function of V anda decreasing 
function of M. 
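Example 10.5 Case $\lambda \neq 0$ and $\delta = r$
- The optimal weight on stock is given by:

$$w_{S,t}^* = k\,\frac{V_t - \lambda M_t}{V_t} = k\left(1 - \lambda \frac{M_t}{V_t}\right), \quad \text{with} \quad k = \frac{\mu - r}{\sigma^2}\,\frac{1}{(1 - \lambda)(1 - \alpha) + \lambda}.$$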

When $\lambda = 0$, we recover Merton's solution.

Thus, for CRRA utility functions, the optimal strategy is equivalent to an
investment in the risky asset which is proportional to the cushion $V_t - \lambda M_t$.
This is analogous to the CPPI method (see Chapter 9) but with a stochastic
floor $\lambda M_t$. When we assume $M_t = K$, we recover the standard CPPI method.

The optimal weight $w_{S,t}^*$ is an increasing function of the ratio $\frac{V_t}{M_t}$. When
the portfolio value $V_t$ is close to the floor $\lambda M_t$, the optimal weight $w_{S,t}^*$ is close
to 0, since the guarantee must not be violated.

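- The optimal portfolio value $V$ is the solution of the following SDE:

$$dV_t = k\,(V_t - \lambda M_t)\left[(\mu - r)\, dt + \sigma\, dW_t\right] \quad \text{with} \quad M_t = \max\left(M_0,\ \max_{s \le t} V_s\right).$$

The process $\ln\left(\frac{V_t}{M_t} - \lambda\right)$ is a regulated Brownian motion. Then, we deduce:

$$V_t = \lambda M_0 \exp\left((1 - \lambda) L_t\right) + (V_0 - \lambda M_0) \exp\left[-\lambda L_t + \left(k\bar{\mu} - \frac{1}{2}k^2\sigma^2\right)t + k\sigma W_t\right],$$

with

$$L_t = \max_{s \le t}\left[0\ ;\ \ln\left(\frac{V_0}{M_0} - \lambda\right) + \left(k\bar{\mu} - \frac{1}{2}k^2\sigma^2\right)s + k\sigma W_s - \ln(1 - \lambda)\right].$$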
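
The drawdown-controlled strategy can be simulated directly from the SDE above. The following is a minimal Euler sketch in discounted units (case $\delta = r$), assuming hypothetical parameter values; it checks along one path that the wealth stays above the floor $\lambda M_t$, so that the drawdown never exceeds $1 - \lambda$.

```python
import numpy as np

def simulate_drawdown_strategy(V0, M0, lam, mu_bar, sigma, alpha, T, n_steps, seed=0):
    """Euler scheme for dV = k (V - lam*M) (mu_bar dt + sigma dW), M = running maximum.

    k = mu_bar / sigma**2 / ((1 - lam)(1 - alpha) + lam); values are discounted.
    """
    k = mu_bar / sigma**2 / ((1.0 - lam) * (1.0 - alpha) + lam)
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    V = np.empty(n_steps + 1)
    M = np.empty(n_steps + 1)
    V[0], M[0] = V0, max(M0, V0)
    for i in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()
        cushion = V[i] - lam * M[i]               # amount above the stochastic floor
        V[i + 1] = V[i] + k * cushion * (mu_bar * dt + sigma * dW)
        M[i + 1] = max(M[i], V[i + 1])            # update the running maximum
    return V, M, k

# Hypothetical inputs: excess return 4%, volatility 20%, alpha = -1, lambda = 0.8.
lam = 0.8
V, M, k = simulate_drawdown_strategy(100.0, 100.0, lam, 0.04, 0.2, -1.0, 10.0, 2520)
print("multiple k =", k)
print("largest drawdown =", float(np.max(1.0 - V / M)))
```

The exposure $k(V_t - \lambda M_t)$ shrinks as the wealth approaches the floor, which is the mechanism that keeps the drawdown below $1 - \lambda$.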
10.3 Value-at-Risk and expected shortfall based management


In this section, the guarantee condition is satisfied at a given probability
threshold, not necessarily equal to 1. Two cases are examined:


• First, the analysis of portfolio asset management under dynamic safety
constraints which control the probability that the yield falls under a
given level.


• Second, the expected utility maximization under VaR/CVaR constraints.


10.3.1 Dynamic safety criteria 


Consider safety criteria such as those of Roy [437] and Kataoka [324], which
are presented in a one-period setting in Chapter 3. In what follows, we
use results from Prigent and Toumi [418] and Toumi [492]. To simplify the
exposition, the financial market is assumed to evolve in continuous time and
is supposed to be dynamically complete (see [492] for the discrete-time
and incomplete cases).


10.3.1.1 Roy Criterion 


There exists a riskless asset taken as numeraire. The risky asset price $S$
is supposed to be a semimartingale $S = (S_t)_{t \in [0,T]}$ on a complete probability
space $(\Omega, \mathcal{F}, P)$ with a filtration $(\mathcal{F}_t)_{t \in [0,T]}$.


The predictable strategy process $\xi$ is assumed to be self-financing with an
initial invested amount $V_0 > 0$. Such a strategy $(V_0, \xi)$ is called admissible if
the process defined by:

$$V_t = V_0 + \int_0^t \xi_s\, dS_s, \quad \forall t \in [0, T],\ P\text{-a.s.} \tag{10.21}$$

satisfies:
$$V_t \ge 0, \quad \forall t \in [0, T],\ P\text{-a.s.}$$


Since the financial market is assumed to be complete, there exists a unique 
equivalent martingale measure Q ~ P. 


According to Roy's criterion, the investor searches for the best admissible
strategy $(V_0, \xi)$ which maximizes the probability that the portfolio terminal
value verifies Vr > e’’ RuyinVo, for a given minimal fixed return Rmin. 
Therefore, the investor has to solve the following optimization problem:

$$\max_{\xi}\ P\left[V_0 + \int_0^T \xi_s\, dS_s \ge R_{Min} V_0\right]. \tag{10.22}$$

For fixed $R_{Min}$, consider the "success set" $A_{V_0,\xi}$ defined by:
$$A_{V_0,\xi} = \{V_T \ge R_{Min} V_0\}. \tag{10.23}$$


First step: We show that the Roy problem is equivalent to the determina- 
tion of a success set of maximal probability: 


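PROPOSITION 10.5
Let $\tilde{A} \in \mathcal{F}_T$ be a solution to the problem

$$\max_{A \in \mathcal{F}_T} P[A], \tag{10.24}$$

under the constraint:

$$E_Q[\mathbb{I}_A] \leq \frac{1}{R_{Min}}. \tag{10.25}$$

Let $\tilde{\xi}$ be the perfect hedge of the option $H = \mathbb{I}_{\tilde{A}} \in L^1(Q)$, i.e.,

$$E_Q[\mathbb{I}_{\tilde{A}} \mid \mathcal{F}_t] = E_Q[\mathbb{I}_{\tilde{A}}] + \int_0^t \tilde{\xi}_s \, dS_s, \quad \forall t \in [0,T], \ P\text{-a.s.} \tag{10.26}$$

Then $\left( V_0, \frac{V_0}{E_Q[\mathbb{I}_{\tilde{A}}]} \tilde{\xi} \right)$ is the solution of problem (10.22), and the
corresponding success set $A_{V_0,\tilde{\xi}}$ is equal almost surely to $\tilde{A}$.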


REMARK 10.8 - The proof is detailed in [492]. It is based on results
on quantile hedging, provided in Föllmer and Leukert [234]. An alternative
proof is given in Browne [95] for the (Gaussian) complete case.


- The problem of constructing a maximal success set according to the Roy
criterion is then solved by applying the Neyman-Pearson lemma, in a similar manner
to the case of quantile hedging in Föllmer and Leukert [234]. In fact, the Roy
problem can be viewed as a quantile hedging of the constant $H = R_{Min} V_0$
under the constraint that the initial capital $V_0$ to invest is smaller than H and
that $R_{Min}$ is higher than 1 (note that here the riskless return is equal to 1).











Since obviously $Q[A] = E_Q[\mathbb{I}_A]$, problem (10.24) is equivalent to the
maximization of P[A] under the constraint $Q[A] \leq 1/R_{Min}$. This is the reason why
we can apply the Neyman-Pearson lemma.

Let $\tilde{a}$ be the threshold defined by:

$$\tilde{a} = \inf \left\{ a : Q\left[ \frac{dP}{dQ} > a \right] \leq \frac{1}{R_{Min}} \right\}, \tag{10.27}$$

and consider the set

$$\tilde{A} = \left\{ \frac{dP}{dQ} > \tilde{a} \right\}. \tag{10.28}$$


PROPOSITION 10.6
Assume that the set $\tilde{A}$ defined by the two previous relations satisfies:

$$Q[\tilde{A}] = \frac{1}{R_{Min}}. \tag{10.29}$$

Then the strategy $\left( V_0, \frac{V_0}{E_Q[\mathbb{I}_{\tilde{A}}]} \tilde{\xi} \right)$, where $\tilde{\xi}$ is the perfect hedge of $\mathbb{I}_{\tilde{A}}$ as in (10.26), is the solution to problem (10.22).











Second step: We maximize the expected success ratio.
The condition $Q[\tilde{A}] = 1/R_{Min}$ is clearly satisfied when:

$$P\left[ \frac{dP}{dQ} = \tilde{a} \right] = 0. \tag{10.30}$$

Generally, it is not easy to find a set $\tilde{A} \in \mathcal{F}_T$ satisfying (10.30). In this
case, the Neyman-Pearson theory suggests the replacement of the critical
region $\tilde{A} \in \mathcal{F}_T$ with a randomized test, i.e., by an $\mathcal{F}_T$-measurable function $\varphi$ such
that $0 \leq \varphi \leq 1$.


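Let $\mathcal{R}$ be the set of these functions $\varphi$ and consider the following optimization
problem:

$$E_P[\tilde{\varphi}] = \max_{\varphi \in \mathcal{R}} E_P[\varphi] \tag{10.31}$$

under the constraint

$$E_Q[\varphi] \leq \frac{1}{R_{Min}}. \tag{10.32}$$


The Neyman-Pearson lemma proves that the solution $\tilde{\varphi}$ of (10.31, 10.32)
has the form:

$$\tilde{\varphi} = \mathbb{I}_{\left\{ \frac{dP}{dQ} > \tilde{a} \right\}} + \gamma \, \mathbb{I}_{\left\{ \frac{dP}{dQ} = \tilde{a} \right\}}, \tag{10.33}$$

where $\tilde{a}$ is given by (10.27), and where $\gamma$ is defined by:

$$\gamma = \frac{1/R_{Min} - Q\left[ \frac{dP}{dQ} > \tilde{a} \right]}{Q\left[ \frac{dP}{dQ} = \tilde{a} \right]}. \tag{10.34}$$

This allows a solution for the dynamic Roy problem: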


DEFINITION 10.2 Let $(V_0, \xi)$ be an admissible strategy. We define the
“success ratio” associated to this strategy as

$$\varphi_{V_0,\xi} = \mathbb{I}_{\{V_T \geq R_{Min} V_0\}} + \frac{V_T}{R_{Min} V_0} \, \mathbb{I}_{\{V_T < R_{Min} V_0\}}. \tag{10.35}$$

Note that $\varphi_{V_0,\xi} \in \mathcal{R}$ and that the set $\{\varphi_{V_0,\xi} = 1\}$ coincides with the success set
$A_{V_0,\xi} = \{V_T \geq R_{Min} V_0\}$ associated to the strategy $(V_0, \xi)$.











We search for the strategy maximizing the expected success ratio $E[\varphi_{V_0,\xi}]$
under the probability P over the set of admissible strategies:

$$\max \left\{ E[\varphi_{V_0,\xi}] : (V_0, \xi) \text{ admissible} \right\}. \tag{10.36}$$











PROPOSITION 10.7
Let $\tilde{\xi}$ represent the process determining the perfect hedging of the $\mathcal{F}_T$-measurable
function $\tilde{\varphi}$ defined by (10.33). Then the strategy $\left( V_0, \frac{V_0}{E_Q[\tilde{\varphi}]} \tilde{\xi} \right)$ is a solution
to (10.36).
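
The Neyman-Pearson construction of the threshold (10.27), the weight (10.34) and the randomized test (10.33) can be illustrated on a finite state space. This is only a minimal sketch: the finite setting, the state probabilities `p` and `q`, and the function name `neyman_pearson_test` are assumptions made for illustration, whereas the text deals with a general complete market.

```python
def neyman_pearson_test(p, q, budget):
    """Randomized test maximizing E_P[phi] subject to E_Q[phi] <= budget.

    p, q: state probabilities under P and Q on a hypothetical finite state
    space (all q[i] > 0); budget plays the role of 1 / R_Min.
    """
    ratios = [pi / qi for pi, qi in zip(p, q)]  # dP/dQ, state by state

    def q_mass(condition):
        # Q-probability of the states whose likelihood ratio meets the condition.
        return sum(qi for qi, ratio in zip(q, ratios) if condition(ratio))

    # Threshold (10.27): smallest a >= 0 with Q[dP/dQ > a] <= budget.
    # On a finite space it is either 0 or one of the observed ratios.
    candidates = sorted(set(ratios) | {0.0})
    a_tilde = next(a for a in candidates
                   if q_mass(lambda ratio: ratio > a) <= budget + 1e-12)

    # Randomization weight (10.34) on the boundary {dP/dQ = a_tilde}.
    boundary = q_mass(lambda ratio: ratio == a_tilde)
    above = q_mass(lambda ratio: ratio > a_tilde)
    gamma = (budget - above) / boundary if boundary > 0 else 0.0
    gamma = min(gamma, 1.0)  # only relevant when budget >= 1

    # Test (10.33): 1 above the threshold, gamma on the boundary, 0 below.
    phi = [1.0 if ratio > a_tilde else gamma if ratio == a_tilde else 0.0
           for ratio in ratios]
    return a_tilde, gamma, phi


# Hypothetical three-state market with R_Min = 1.25.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
a_tilde, gamma, phi = neyman_pearson_test(p, q, budget=1 / 1.25)
cost = sum(qi * f for qi, f in zip(q, phi))      # E_Q[phi], the budget used
success = sum(pi * f for pi, f in zip(p, phi))   # E_P[phi], the success ratio
print(a_tilde, gamma, phi, cost, success)
```

In this example the budget constraint (10.32) is saturated only thanks to the randomization on the boundary state, which is the situation where condition (10.30) fails.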


Example 10.6 
Examine the Roy criterion in the Black and Scholes framework. 


The price process S of the risky asset is given by:

$$dS_t = S_t (m \, dt + \sigma \, dW_t),$$

where W is a Brownian motion under P and m is constant. To simplify, the
interest rate is assumed to be null: r = 0.


The unique equivalent martingale measure is defined by:

$$\frac{dQ}{dP} = \exp\left[ -\frac{m}{\sigma} W_T - \frac{1}{2} \left( \frac{m}{\sigma} \right)^2 T \right].$$

The process $W^*_t = W_t + \frac{m}{\sigma} t$ is a Brownian motion under Q and the risky
asset price is equal to:

$$S_t = S_0 \exp\left( \sigma W^*_t - \frac{1}{2} \sigma^2 t \right).$$


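For fixed $R_{Min}$, the optimal Roy strategy is the replicating strategy of
the option $H = \mathbb{I}_{\tilde{A}}$, where $\tilde{A}$ is written as $\tilde{A} = \left\{ \frac{dP}{dQ} > \tilde{a} \right\}$, and where $\tilde{a}$ is
determined from the relation

$$E_Q[\mathbb{I}_{\tilde{A}}] = \frac{1}{R_{Min}}. \tag{10.37}$$

Since we have:

$$\frac{dP}{dQ} = \theta \, (S_T)^{m/\sigma^2},$$

with

$$\theta = S_0^{-m/\sigma^2} \exp\left[ \left( \frac{m}{2} - \frac{m^2}{2\sigma^2} \right) T \right],$$

then we deduce:

$$\tilde{A} = \left\{ \theta \, (S_T)^{m/\sigma^2} > \tilde{a} \right\}.$$

We consider only the case m > 0 (the financial market has a positive trend).
In this case, the success set $\tilde{A}$ has the following form:

$$\tilde{A} = \{ S_T > c \},$$

where c is determined from the relation $E_Q[\mathbb{I}_{\tilde{A}}] = 1/R_{Min}$.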





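The Roy optimal portfolio has a payoff equal to the payoff of an option that
can be written as:

$$\mathbb{I}_{\tilde{A}} = \mathbb{I}_{\{S_T > c\}},$$

and the Roy success probability is then given by:

$$P(\tilde{A}) = \Phi\left( -\frac{b - \frac{m}{\sigma} T}{\sqrt{T}} \right),$$

where $\Phi$ is the cdf of the standard Gaussian distribution, and b is such that:

$$c = S_0 \exp\left( \sigma b - \frac{1}{2} \sigma^2 T \right).$$

Setting

$$d_-(c,t) = -\frac{1}{\sigma \sqrt{T-t}} \ln\left( \frac{c}{S_t} \right) - \frac{1}{2} \sigma \sqrt{T-t},$$

the value $V_t$ of the option to duplicate is equal to:

$$V_t = E_Q[\mathbb{I}_{\tilde{A}} \mid \mathcal{F}_t] = \Phi(d_-(c,t))$$

and

$$b = -\sqrt{T} \, \Phi^{-1}\left( \frac{1}{R_{Min}} \right).$$

The expressions of the Delta $\Delta(t, S_t) = \frac{\partial V}{\partial S_t}(t, S_t)$ and of the Gamma
$\Gamma(t, S_t) = \frac{\partial^2 V}{\partial S_t^2}(t, S_t)$ allow us to study the variation of the quantity invested
in the risky asset.


We obtain:

$$\tilde{\xi}(t) = \Delta(t, S_t) = \frac{1}{\sigma S_t \sqrt{2\pi (T-t)}} \exp\left( -d_-(c,t)^2/2 \right),$$

$$\Gamma(t, S_t) = -\frac{1}{\sigma^2 S_t^2 (T-t) \sqrt{2\pi}} \left( \sigma \sqrt{T-t} + d_-(c,t) \right) \exp\left( -d_-(c,t)^2/2 \right).$$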
Consider the following numerical values:

$S_0 = 100$, $m = 0.03$, $\sigma = 0.2$, $T = 1$, and $R_{Min} = 1.2$.


FIGURE 10.6: Dynamic Roy portfolio payoff (portfolio value as a function of the stock value)


FIGURE 10.7: Probability of success as function
of the minimal return $R_{Min}$
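
The closed-form quantities of Example 10.6 can be evaluated for these numerical values. The following Python sketch is only an illustration under the assumptions of the example (Black-Scholes dynamics, r = 0, m > 0); the function name `roy_black_scholes` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def roy_black_scholes(s0, m, sigma, T, r_min):
    """Return (b, c, success probability) for the Roy problem with r = 0."""
    std_normal = NormalDist()
    # The budget condition Q[S_T > c] = Q[W*_T > b] = 1 / R_Min fixes b.
    b = -sqrt(T) * std_normal.inv_cdf(1.0 / r_min)
    # Stock level above which the guaranteed amount R_Min * V_0 is paid.
    c = s0 * exp(sigma * b - 0.5 * sigma**2 * T)
    # Under P, W*_T = W_T + (m / sigma) T, hence P[W*_T > b].
    success = std_normal.cdf(-(b - (m / sigma) * T) / sqrt(T))
    return b, c, success


b, c, success = roy_black_scholes(s0=100.0, m=0.03, sigma=0.2, T=1.0, r_min=1.2)
print(f"b = {b:.4f}, c = {c:.2f}, P(success) = {success:.4f}")
```

Since m > 0, the success probability under P exceeds the risk-neutral probability $1/R_{Min}$ of the same event.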
REMARK 10.9
- In order to maximize the success probability, the investor chooses to get
exactly the minimum return for risky asset values above a given level. Indeed,
if he received a higher return than the minimum one, the success
probability would be reduced for the same initial investment.


- Additionally, we note that for small values of the risky asset, the optimal
portfolio payoff is null, which induces extreme losses.


- To avoid such a problem, it would be preferable to use a criterion such as
the expected shortfall, which takes more account of the loss sizes.


- For large values of $S_t$, the investor chooses to reduce the quantity $\Delta(t, S_t)$
invested in the risky asset. Once the condition

$$V_T \geq R_{Min} V_0$$

is achieved, the investor is satisfied and the increase of the value of the asset
does not interest him any more. However, for small values of $S_t$, $\Delta(t, S_t)$ is
an increasing function of $S_t$.


- The success probability curve is obviously a decreasing function of $R_{Min}$,
since it is more difficult to guarantee a larger return. Besides,

$$P(\tilde{A}) = \Phi\left( -\frac{b - \frac{m}{\sigma} T}{\sqrt{T}} \right)$$

is an increasing function of m.

- Note also that $P(\tilde{A})$ is a decreasing function of the volatility $\sigma$. Actually,
the event $V_T \geq R_{Min} V_0$ is less probable at the expiry date if the volatility is
larger.
10.3.2 Expected utility under VaR/CVaR constraints


In what follows, we examine Value-at-Risk based management, as analyzed
for example in Basak and Shapiro [47].


- The financial market contains a riskless asset with rate r. The price S of
the d risky assets is assumed to be a semimartingale on a filtered probability
space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$, and to be the solution of the SDE:

$$dS_t = \mathrm{diag}(S_t) \left[ \mu(t) \, dt + \sigma(t) \, dW_t \right], \tag{10.38}$$

where W is a d-dimensional standard Brownian motion and the usual
assumptions on coefficient functions are made. This financial market is
complete. Thus, there exists one and only one risk-neutral probability Q defined
as follows.

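Denote the relative risk process $\eta$ by:

$$\eta(t) = \sigma(t)^{-1} \left[ \mu(t) - r(t) \mathbf{1} \right],$$

with:

$$\int_0^T \|\eta(t)\|^2 \, dt < \infty.$$

The exponential local martingale L:

$$L(t) = \exp\left[ -\frac{1}{2} \int_0^t \|\eta(s)\|^2 \, ds - \int_0^t \eta(s)^{\top} dW_s \right],$$

is the Radon-Nikodym density of the risk-neutral probability Q w.r.t. the
probability P.
Denote R the discount factor:

$$R(t) = \exp\left[ -\int_0^t r(s) \, ds \right].$$

Denote M as the product: M(t) = R(t)L(t).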


- Expected utility maximization: the investor has a utility function on wealth
$U : (0, +\infty) \to \mathbb{R}$. Using the martingale approach, the dynamic optimization
problem under a VaR constraint is the following:

$$\max E[U(V_T)],$$
$$\text{under } E[M_T V_T] \leq V_0,$$
$$P[V_T \geq V_{Min}] \geq 1 - \varepsilon.$$
PROPOSITION 10.8 Basak and Shapiro [47]


The optimal portfolio value $V_T^{VaR}$ with VaR constraint is given by:

$$V_T^{VaR} = \begin{cases} J(y M_T) & \text{if } M_T < \underline{M}, \\ V_{Min} & \text{if } \underline{M} \leq M_T < \bar{M}, \\ J(y M_T) & \text{if } \bar{M} \leq M_T, \end{cases}$$

where $J = (U')^{-1}$, $\underline{M} = U'(V_{Min})/y$, $\bar{M}$ is such that $P[M_T > \bar{M}] = \varepsilon$ and the positive Lagrange
multiplier y is such that $E[M_T V_T^{VaR}] = V_0$.
The VaR constraint is binding if and only if $\underline{M} < \bar{M}$. Also, the Lagrange
parameter y is decreasing in $\varepsilon$ and $y \in [y^B, y^{PI}]$.














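Define the quantity $V_{Min}^{\varepsilon}$ by:

$$V_{Min}^{\varepsilon} = \begin{cases} J(y \bar{M}) & \text{if } \underline{M} < \bar{M}, \\ V_{Min} & \text{otherwise.} \end{cases}$$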


The following figure plots the optimal portfolio value as a function of the
state price $M_T$. The right thin curve corresponds to the portfolio B (case of
no VaR constraint: $\varepsilon = 1$). The left thin curve (PI) is the portfolio value when
the insurance constraint $V_T \geq V_{Min}$ must always be satisfied ($\varepsilon = 0$).


FIGURE 10.8: Optimal portfolio value with VaR constraints (portfolio value $V_T^{VaR}$ against the state price $M_T$, with thresholds $\underline{M}$ and $\bar{M}$)
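
The three-regime structure of Proposition 10.8 can be sketched in Python for a power utility $U(V) = V^{\alpha}/\alpha$, for which $J(x) = x^{1/(\alpha - 1)}$. This is only an illustration: the multiplier `y` and the threshold `m_bar` are taken as given hypothetical inputs instead of being solved from the budget equation and from the quantile condition, and the function name is an assumption.

```python
def var_constrained_wealth(m_T, y, v_min, m_bar, alpha):
    """Terminal wealth of Proposition 10.8 for U(V) = V**alpha / alpha, alpha < 1.

    y (Lagrange multiplier) and m_bar (upper threshold) are hypothetical
    inputs; in the proposition they follow from the budget equation and
    from P[M_T > m_bar] = epsilon.
    """
    def J(x):
        # Inverse of the marginal utility U'(V) = V**(alpha - 1).
        return x ** (1.0 / (alpha - 1.0))

    m_low = v_min ** (alpha - 1.0) / y  # lower threshold U'(V_Min) / y
    if m_T < m_low:
        return J(y * m_T)   # cheap states: unconstrained-type wealth
    if m_T < m_bar:
        return v_min        # intermediate states: wealth held at the floor
    return J(y * m_T)       # costly states: wealth falls below the floor


# Hypothetical inputs: alpha = -1, y = 1, V_Min = 1, M_bar = 2.
for state_price in (0.25, 1.0, 1.5, 2.0, 4.0):
    print(state_price, var_constrained_wealth(state_price, 1.0, 1.0, 2.0, -1.0))
```

The wealth is continuous at the lower threshold and jumps down at `m_bar`, which is the discontinuity visible in Figure 10.8.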
REMARK 10.10 Whenever the constraint is binding, the investor must
reduce portfolio losses to satisfy the VaR constraint. Then, he chooses to
increase portfolio losses in the “costly” states (i.e. $M_T > \bar{M}$). Unfortunately,
these events correspond to the largest losses when there is no VaR
constraint. Therefore, the portfolio value with a VaR constraint has a fatter
left tail!


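PROPOSITION 10.9
(Power utility)
Assume that $U(V) = \frac{V^{\alpha}}{\alpha}$ with $\alpha < 1$ and that r and $\eta$ are constant.


- The optimal wealth is given by:

$$\begin{aligned} V_t^{VaR} = {} & \frac{e^{\Gamma(t)}}{(y M_t)^{\frac{1}{1-\alpha}}} \\ & - \left[ \frac{e^{\Gamma(t)}}{(y M_t)^{\frac{1}{1-\alpha}}} \Phi(-d_1(\underline{M})) - V_{Min} e^{-r(T-t)} \Phi(-d_2(\underline{M})) \right] \\ & + \left[ \frac{e^{\Gamma(t)}}{(y M_t)^{\frac{1}{1-\alpha}}} \Phi(-d_1(\bar{M})) - V_{Min} e^{-r(T-t)} \Phi(-d_2(\bar{M})) \right], \end{aligned} \tag{10.39}$$

where $\Phi$ is the cdf of the standard normal distribution and

$$\Gamma(t) = \frac{\alpha}{1-\alpha} \left( r + \frac{\|\eta\|^2}{2} \right) (T-t) + \left( \frac{\alpha}{1-\alpha} \right)^2 \frac{\|\eta\|^2}{2} (T-t),$$

$$d_2(x) = \frac{\ln\left( \frac{x}{M_t} \right) + \left( r - \frac{\|\eta\|^2}{2} \right)(T-t)}{\|\eta\| \sqrt{T-t}},$$

$$d_1(x) = d_2(x) + \frac{1}{1-\alpha} \|\eta\| \sqrt{T-t}.$$

- The fraction of wealth invested in stocks is given by:

$$w_t^{VaR} = q_t^{VaR} \, w^B,$$

where the value $w^B$ and the exposure to risky assets relative to the benchmark
B are:

$$w^B = \frac{1}{1-\alpha} \left( \sigma \sigma^{\top} \right)^{-1} (\mu - r \mathbf{1}),$$

$$q_t^{VaR} = 1 - \frac{V_{Min} e^{-r(T-t)} \left( \Phi(-d_2(\underline{M})) - \Phi(-d_2(\bar{M})) \right)}{V_t^{VaR}} + \frac{(1-\alpha) \left( V_{Min} - V_{Min}^{\varepsilon} \right) e^{-r(T-t)} \phi(d_2(\bar{M}))}{V_t^{VaR} \, \|\eta\| \sqrt{T-t}},$$

where $\phi$ is the standard normal pdf and $V_{Min}^{\varepsilon} = J(y \bar{M})$ when the VaR constraint is binding.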
10.4 Further reading


Portfolio insurance constraints of the American type are examined in El
Karoui et al. [192]. For example, for the CRRA case, the optimal solution is
based on American puts. Optimal portfolios with controlled drawdowns are
also determined in Cvitanic and Karatzas [138].


The dynamic minimization of expected shortfall is detailed in Pham [409]
and Föllmer and Leukert [235].


Cuoco et al. [135] introduce a dynamic VaR constraint. Contrary to the 
Basak and Shapiro results when the VaR constraint is static, they prove that, 
if the investor can fully use the current information, then the risk exposure 
of an investor, subject to a VaR constraint, is always lower than that of an 
investor with no VaR constraint. 


Emmer et al. [203] also study VaR-type constraints applied to the determi- 
nation of optimal portfolios with bounded Capital-at-Risk. They examine the 
continuous-time maximization of the expected terminal wealth under an up- 
per bound on the Capital-at-Risk. In a Black-Scholes framework, they deduce 
explicit formulae. They also consider generalized inverse Gaussian diffusions 
for which some qualitative results can be proved and simple simulation meth- 
ods can be proposed. 


The equilibrium of portfolio insurance is analyzed in Grossman and Zhou 
[276] and Basak [48], who examine the impact of portfolio insurance methods 
on market volatility. 


Portfolio insurance with stochastic dominance criterion is developed in El 
Karoui and Meziou [193] using a general concave criterion over all martin- 
gales with American constraint. The result is also used to determine utility- 
maximizing strategies. 


Chapter 11 


Hedge funds 


11.1 The hedge funds industry 
11.1.1 Introduction 


The development of hedge funds over recent years is due to different factors:


• During the 1990s, the equity bull market favored the development of
financial investment. Higher returns provided by more complex
portfolios had earlier been requested by investors with relatively weak
risk aversion who were searching for alternative investments.


• Since the year 2000, to protect their capital, investors have turned to
hedge funds in order to diversify and to limit their exposure to the
main financial indices.


Figure 11.1 illustrates the development of the hedge funds industry.
FIGURE 11.1: The hedge funds development
A hedge fund is a pool where the investor may be a (standard) shareholder
with or without influence on the pool governance or may have a partnership.


11.1.2 Main strategies 


The ranking of hedge funds by investment styles is not easy:
• The classification is not standardized.
• The number of subclasses is still growing.


• The relative lack of transparency of some hedge funds does not facilitate the
analysis.


Recall two usual classifications: HFR and CSFB Tremont. 


TABLE 11.1: HFR and CSFB classifications

HFR                       | CSFB Tremont
Emerging markets          | Emerging markets
Equity hedge              | Long/Short equity
Distressed securities     |
Equity market neutral     | Market neutral
Equity non hedge          |
Event driven              | Event driven
Fixed-income              | Fixed-income arbitrage
Market timing             | Managed futures
Merger arbitrage          |
Regulation D              |
Relative value arbitrage  |
Convertible arbitrage     | Convertible arbitrage
Sector                    |
Short selling             | Dedicated short bias
Statistical arbitrage     |
Funds of funds            |
According to Amenc et al. [24], the classification can also be based on
principal component analysis. This method of classification includes:


• Convertible arbitrage and volatility arbitrage


The goal is to exploit differences of prices between markets. The
different risks are extracted: stock, rates, volatility, and credit. These
relatively recent funds are rather complex. For example, their objective
can be to provide the monetary rate plus a fixed excess return of x%.


• Commodities trading advisors (CTA)


A CTA fund (for example a “Trend Follower” fund) is based on market
anticipations of futures and forwards, such as the exchange markets. These
strategies are uncorrelated with the usual financial markets, which
ensures diversification.


• Merger/acquisition arbitrage


Such funds are, for instance, the “Merger Arbitrage,” “Risk Arbitrage,”
and “Event Driven” funds. Long and short positions are taken simultaneously
on firms involved in a merger or acquisition process.


• “Distressed” funds


Bonds and stocks of a firm which has a bankruptcy risk may have
higher values in the future, if the firm can survive and grow again.


• “Long/Short Equity” funds


This is one of the oldest alternative strategies and one of the most
important, according to the amounts invested. It uses
all types of assets (stocks, options, etc.). The idea is to short sell midcap
or bigcap stocks.


• “Fixed-Income Arbitrage” funds


These are based on arbitrages on fixed-income markets: treasury bonds,
futures and options on interest rates (e.g., swaptions, caps, floors), credit
derivatives, etc.


• “Global Macro” funds


This strategy is based on macroeconomic variables: stock indices,
exchange rates, inflation rate, taxation policy, etc. Using information from
macroeconomic factors may allow better anticipation of unusual price
variations.
11.2 Hedge fund performance
11.2.1 Return distributions


Hedge fund returns are not easy to estimate, since different biases can
occur, according to, for instance, Fung and Hsieh ([246], [247]):


- The “survival” bias;
- The “selection” bias; and,
- The “instant history” bias.


• Data are not necessarily public and, therefore, some of them are not
available. A “self reporting” bias can also affect the results. The
selection bias is also due to non-standardized selection criteria.


• The survival bias is due to the fact that “badly” managed funds
disappear, while the other ones remain. Therefore, funds for which we have
data for a relatively long period are more successful. For example,
Gregoriou [268] examined data on the Zurich market over the time period
1990-2001. The median of the life duration for a hedge fund is
about 5.5 years.


• The instant history bias is generated by differences between dates when
funds are introduced in databases. Newly incorporated funds generally
have higher recent performances.


• Very often, the assumptions of independence and stationarity are not
satisfied. The flexibility and the short life of such funds also induce
statistical problems. Hedge fund returns are not Gaussian, for example
when derivative assets are included or specific dynamic strategies are
used.


Hedge funds also have risk exposures to volatility risk, default and credit 
risk, and liquidity problems. 

Therefore, all these sources of risk must be taken into account in order to 
measure the performance of alternative investments. 
11.2.2 Sharpe ratio limits


The performance measures introduced in Chapter 5 are mainly based on
the Markowitz criterion. Therefore, they are valid when standard
assumptions on asset returns, such as normal distributions, are satisfied, or when
investors have utility functions depending only on the first two moments.


As soon as probability distributions are no longer symmetrical, performance
measures such as the Sharpe ratio may no longer be appropriate. Strategies which
do not require any anticipation of the fund manager can get higher Sharpe
ratios than “buy and hold” strategies.

Therefore:


• We can search for alternative performance measures as in Leland [348].


• We can also profit from this inadequacy to maximize the Sharpe ratio
as shown by Goetzmann et al. [257].


11.2.2.1 Sharpe ratio inadequacy 


11.2.2.1.1 Static strategy based on options Leland [348] considers a
portfolio which is long on a stock index and short on a call written on this
index.


Therefore:
• The portfolio payoff is a concave function.


• This strategy consists of selling the index when it rises, and buying it
when it decreases.


• Then, the portfolio skewness is reduced.
• No “market timing” or “stock picking” is required.


• Then, this type of strategy provides fairly average returns but with
possible severe losses.


The following graph shows the profile of such a portfolio, for different strikes
K (the call option is supposed to be priced by the Black-Scholes formula).
FIGURE 11.2: Portfolio profile $(S - (S - K)^+)$ as a function of stock value
S


The next table presents some of the characteristics of such a strategy for
various values of the strike K.
Consider the following market parameter values:


$S_0 = 1$, $\mu = 12\%$, $\sigma = 18\%$, $r = 3\%$, $T = 1$.


The corresponding mean and standard deviation of the index return
per year are given respectively by 12.75% and 20.46%.
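
These two figures, and the Sharpe ratio 0.4743 quoted for the index in this section, can be recovered from the parameters. The sketch below assumes (this is not stated explicitly in the text) that the index value at T is lognormal with expected gross return $e^{\mu T}$, that returns are simple returns over the horizon, and that the riskless simple return is $e^{rT} - 1$; the function name is hypothetical.

```python
from math import exp, sqrt


def lognormal_index_stats(mu, sigma, r, T=1.0):
    """Mean, standard deviation and Sharpe ratio of the simple index return."""
    gross_mean = exp(mu * T)                                  # E[S_T / S_0]
    mean_return = gross_mean - 1.0
    std_return = gross_mean * sqrt(exp(sigma**2 * T) - 1.0)   # lognormal std
    riskless_return = exp(r * T) - 1.0
    sharpe = (mean_return - riskless_return) / std_return
    return mean_return, std_return, sharpe


mean_return, std_return, sharpe = lognormal_index_stats(mu=0.12, sigma=0.18, r=0.03)
print(f"mean = {mean_return:.4f}, std = {std_return:.4f}, Sharpe = {sharpe:.4f}")
```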


Notations:

$E(R_P)$ denotes the mean return of portfolio P.

$\beta_P$ and $\alpha_P$ denote the beta and Jensen alpha of portfolio P w.r.t. the index
S.

$RS_P$ is the Sharpe ratio of portfolio P.

$\sigma_P$ and $\varepsilon_P$ are respectively the total risk and specific risk of portfolio P.
TABLE 11.2: Characteristics of the portfolio $S - (S - K)^+$





From the previous table, we can see that: 


- The alpha of portfolio P is always positive, while none of these portfolios 
requires specific anticipation of the fund manager. 


- Moreover, there exists an optimal portfolio with respect to alpha criterion. 
Besides, the Sharpe ratio is a function of the exercise price K. 


- There exists also a value of K which maximizes the Sharpe ratio. Note 
that the Sharpe ratio of the index is equal to 0.4743. 


- Therefore, using the Sharpe ratio, it is possible to dominate a “buy-and-
hold” strategy by a static one based on a long position on the underlying
asset and a short position on a call.


- In addition, the total risk of the portfolio is an increasing function of the 
strike. At the limit, the portfolio risk converges to that of the index when 
the strike goes to infinity. Note also that the specific portfolio risk is first 
increasing then decreasing w.r.t. the strike K. 


- Contrary to this strategy which always has a positive alpha, we can con- 
sider a strategy which always has a negative alpha. To do so, it is sufficient to 
introduce a portfolio which is made up of the index and of a put written on 
the index. As seen in Chapter 9, this portfolio corresponds to an insurance 
strategy with a payoff that is a convex function of the index. 
The following figure shows the profile of such a portfolio as a function of the
terminal value S of the index, according to various strikes H of the put.


FIGURE 11.3: Portfolio profile $(S + (H - S)^+)$


The next table provides some of the characteristics of this strategy using
the same market parameters as before.


TABLE 11.3: Characteristics of the portfolio $S + (H - S)^+$


Portfolio: S + P


Strike    E(R_P) (%)   β_P
H=0       12.75        1
H=0       11.22        0.924
H =       9.41         0.801
H =1.     7.34         0.624
H =1.     5.58         0.434
H =1.     4.38         0.271
H =1.     3.68         0.154

2.818 
4.650 
5.922 
6.153 
5.464 
4.309 


As with the previous strategy, this portfolio insurance strategy also does not 
require any anticipation by the fund manager. However, it generates alphas 
which are systematically negative and which depend on strikes. 
Note that the Sharpe ratio is decreasing w.r.t. the strike H. Thus, the
portfolio insurance strategy which maximizes the Sharpe ratio corresponds to
the value H = 0. It is the index itself. Its Sharpe ratio is equal to 0.4743.

Consequently, as soon as an investment in the stock is hedged by a put
written on it, its performance as measured by the Sharpe ratio is reduced and
this reduction increases with the level of protection (i.e., when the strike H
increases).

However, as seen in Chapter 10, these types of strategy may be interesting 
for investors. 

Goetzmann et al. [257] propose an example of a strategy that requires a 
perfect knowledge of the financial market, but which has a weak Sharpe ratio. 


11.2.2.2 Alternative measure and optimal Sharpe ratio 


Two approaches can be proposed: 


• First, as proposed by Leland (1999), we can search for a modified CAPM
with a beta that can better measure the portfolio risks for any probability
distribution. Then, the alpha of strategies based on options would
be equal to 0.


• Second, it is also possible to determine the portfolio strategy which
maximizes the Sharpe ratio, as in Goetzmann et al. [257].


11.2.2.2.1 Alternative definition of alpha and beta for the CAPM
Leland [348] introduces an alternative definition for parameters alpha and
beta in the CAPM framework. This new valuation model takes account of all
the moments of the probability distribution.

It relies on a valuation model introduced by Rubinstein [438], which is based on a
power utility.


Under such assumptions, the CAPM is:

$$E[R_P] = R_f + B_P \left( E[R_M] - R_f \right), \tag{11.1}$$

where

$$B_P = \frac{\mathrm{Cov}\left[ R_P, -(1 + R_M)^{-\gamma} \right]}{\mathrm{Cov}\left[ R_M, -(1 + R_M)^{-\gamma} \right]}. \tag{11.2}$$

Note that for $\gamma = -1$, the utility function is quadratic. Then, we recover
the standard CAPM formula, since:

$$B_P = \frac{\mathrm{Cov}[R_P, R_M]}{\mathrm{Cov}[R_M, R_M]}. \tag{11.3}$$

The risk aversion coefficient $\gamma$ of the representative investor is given by:

$$\gamma = \frac{\ln\left( E[1 + R_M] \right) - \ln(1 + R_f)}{\mathrm{Var}\left[ \ln(1 + R_M) \right]}. \tag{11.4}$$
The new definition of alpha is:

$$A_P = \left( E[R_P \mid I_G] - R_f \right) - B_P \left( E[R_M] - R_f \right). \tag{11.5}$$

The term $I_G$ denotes the information available for the fund manager. At
market equilibrium, for the CAPM, $A_P$ must be null. It can differ from zero only if the fund
manager has private information and good anticipations. Note that the new
alphas of portfolios S − C and S + P are null for all strikes.
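
The modified beta (11.2), the risk aversion coefficient (11.4) and the modified alpha (11.5) can be estimated from return samples. The following Python sketch is only an illustration: sample moments replace the expectations, the conditional expectation $E[R_P \mid I_G]$ is replaced by the sample mean of the portfolio return, and the function names and the return series are hypothetical.

```python
from math import log
from statistics import fmean


def covariance(x, y):
    """Population covariance of two equally long samples."""
    mean_x, mean_y = fmean(x), fmean(y)
    return fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])


def leland_gamma(market_returns, riskless_return):
    """Risk aversion coefficient of relation (11.4), from sample moments."""
    log_gross = [log(1.0 + r) for r in market_returns]
    mean_gross = fmean([1.0 + r for r in market_returns])
    return (log(mean_gross) - log(1.0 + riskless_return)) / covariance(log_gross, log_gross)


def leland_beta(portfolio_returns, market_returns, gamma):
    """Modified beta B_P of relation (11.2)."""
    kernel = [-(1.0 + r) ** (-gamma) for r in market_returns]
    return covariance(portfolio_returns, kernel) / covariance(market_returns, kernel)


def leland_alpha(portfolio_returns, market_returns, riskless_return, gamma):
    """Modified alpha A_P of relation (11.5), with sample means as expectations."""
    beta = leland_beta(portfolio_returns, market_returns, gamma)
    return (fmean(portfolio_returns) - riskless_return) - beta * (
        fmean(market_returns) - riskless_return)


# Hypothetical yearly returns, for illustration only.
market = [0.10, -0.05, 0.20, 0.02, -0.12, 0.15]
portfolio = [0.07, -0.02, 0.11, 0.03, -0.10, 0.09]
gamma = leland_gamma(market, riskless_return=0.03)
print(gamma, leland_beta(portfolio, market, gamma),
      leland_alpha(portfolio, market, 0.03, gamma))
```

With $\gamma = -1$ the function `leland_beta` reduces to the usual ratio of relation (11.3), and the market portfolio itself always has a modified beta equal to 1 and a modified alpha equal to 0.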


11.2.2.2.2 Strategy which maximizes the Sharpe ratio Recall that
the Sharpe ratio for the market index is equal to 0.4743. The following figure
presents the Sharpe ratio of portfolios which contain one unit of index and −a
units of call, as a function of the strike K, using the same financial parameter
values as before.
FIGURE 11.4: Sharpe ratio as a function of the strike K


The maximal Sharpe ratio is reached for the following values:
a = 0.7826 and K = 0.992.


It is equal to 0.5251 and so is higher than that of the index.
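
The Sharpe ratio of such a portfolio can be computed in closed form from truncated lognormal moments. The sketch below is only an illustration under assumptions that are not all explicit in the text: the index is lognormal under P with expected gross return $e^{\mu T}$, the call is priced by the Black-Scholes formula, and returns are simple returns over the horizon T. The function name is hypothetical, and small differences with the values quoted above may come from conventions or rounding.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cdf


def covered_call_sharpe(a, K, mu=0.12, sigma=0.18, r=0.03, T=1.0, s0=1.0):
    """Sharpe ratio of one unit of index minus `a` calls with strike K."""
    vol = sigma * sqrt(T)

    def upper_moment(power, drift):
        # E[S_T**power * 1{S_T > K}] when S_T is lognormal with the given drift.
        d = (log(s0 / K) + (drift - 0.5 * sigma**2) * T) / vol
        full = s0**power * exp(power * drift * T
                               + 0.5 * power * (power - 1) * sigma**2 * T)
        return full * N(d + power * vol)

    # Black-Scholes call price: discounted risk-neutral expectation (drift r).
    call_price = exp(-r * T) * (upper_moment(1, r) - K * upper_moment(0, r))
    initial_value = s0 - a * call_price

    # Moments of S_T and of the call payoff C_T under P (drift mu).
    m0, m1, m2 = (upper_moment(n, mu) for n in (0, 1, 2))
    e_s = s0 * exp(mu * T)
    e_s2 = s0**2 * exp((2 * mu + sigma**2) * T)
    e_c = m1 - K * m0
    e_c2 = m2 - 2 * K * m1 + K**2 * m0
    e_sc = m2 - K * m1

    mean_v = e_s - a * e_c
    var_v = e_s2 - 2 * a * e_sc + a**2 * e_c2 - mean_v**2
    # Excess return over the riskless return divided by the return volatility;
    # the initial value cancels in the ratio.
    return (mean_v - initial_value * exp(r * T)) / sqrt(var_v)


print(covered_call_sharpe(0.0, 0.992))     # the index itself
print(covered_call_sharpe(0.7826, 0.992))  # the portfolio quoted above
```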


Instead of searching for performance measures which cannot be manipulated 
by strategies based on options, Goetzmann et al. [257] propose to determine 
strategies which maximize the Sharpe ratio without requiring any skill. 
They prove that the best static strategy has a probability distribution which
is right truncated and has a left fat tail. Such a strategy can be approximated
by a combination of a put and a call.


The portfolio is based on an investment of one unit in the index, on the
selling of a European calls with strike K, and on the purchase of b European
puts with strike H (K > H).


The portfolio value at maturity is given by:

$$V_T = S_T - a (S_T - K)^+ + b (H - S_T)^+. \tag{11.6}$$

Thus, its initial value is equal to:

$$P_0 = 1 - a \, C(1, T, \sigma, r; K) + b \, P(1, T, \sigma, r; H). \tag{11.7}$$


At maturity, the profile is concave as shown by Figure 11.5, which
corresponds to portfolios maximizing the Sharpe ratio.


FIGURE 11.5: Portfolio profile maximizing the Sharpe ratio


Using now a units of call and b units of put, we get a maximal Sharpe ratio 
equal to 0.531. This new maximum corresponds to the following values: 


a = 0.705, K = 0.846, and b = 1.93, H = 1.126. 


Note that the main part of the Sharpe ratio increase is obtained from only 
one option. 
These results can be compared with those shown in Chapter 10 when
maximizing expected utility with insurance constraints.

In addition, Goetzmann et al. [257] show that the Sharpe ratio is not very
sensitive to the choice of strikes. Therefore, options close to the spot index
price can be used since they are more liquid.

To summarize, if we are only interested in the performance measure, the results
of Goetzmann et al. [257] suggest that the probability distribution of a fund
which has a high Sharpe ratio should be compared with that of the strategy which
maximizes the Sharpe ratio. In particular, this is true when we consider the
performance measure of hedge funds based on style analysis.


11.2.3 Alternative performance measures 


Chapter 2 deals with the choice of appropriate risk measures. In particular, 
downside risk measures are analyzed. Performance measures for hedge funds 
can be based on such risk measures which focus on loss risk. 


11.2.3.1 The semi-variance 


The Sharpe ratio is based on a dispersion risk measure around the mean.
Therefore, it does not allow us to determine if the variations are below or
above the mean. The semi-variance, SV, is a possible tool to avoid such a
problem:

$$SV(R) = E\left[ \left( (E[R] - R)^+ \right)^2 \right]. \tag{11.8}$$

The semi-variance takes account of such asymmetry. Lower partial moments
are extensions of the semi-variance. They are defined by:

$$E\left[ \left( (E[R] - R)^+ \right)^p \right], \tag{11.9}$$

where the values of parameter p are above 2, which allows us to take better
account of asymmetries and potential high risks.


11.2.3.2 Sortino ratio 


This is one of the most famous ratios. It takes account of loss expectations
(“downside risk”). It is defined by (see Sortino et al. [475], [476], [477]):

$$Sor(R) = \frac{E[R] - L}{\sqrt{E\left[ \left( (L - R)^+ \right)^2 \right]}}, \tag{11.10}$$

where L denotes the minimal acceptable return level (MAR).

This ratio is based on the same principle as the Sharpe ratio. However, 
the riskless rate is replaced by the level L, that is by the minimal acceptable 
return level, while the standard deviation of the return o(R) is replaced by 
the standard deviation of those returns that are below the level L. 
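
The semi-variance (11.8), the lower partial moments (11.9) and the Sortino ratio (11.10) can be estimated from a sample of returns. The following Python sketch is only an illustration in which sample means replace the expectations; the function names and the return series are hypothetical.

```python
from math import sqrt
from statistics import fmean


def lower_partial_moment(returns, level, p):
    """Sample estimate of E[((level - R)^+)^p]."""
    return fmean([max(level - r, 0.0) ** p for r in returns])


def semi_variance(returns):
    """Semi-variance (11.8): lower partial moment of order 2 around the mean."""
    return lower_partial_moment(returns, fmean(returns), 2)


def sortino_ratio(returns, mar):
    """Sortino ratio (11.10) for a minimal acceptable return `mar`.

    The ratio is undefined when no return falls below `mar`
    (the downside deviation is then zero).
    """
    downside_deviation = sqrt(lower_partial_moment(returns, mar, 2))
    return (fmean(returns) - mar) / downside_deviation


# Hypothetical returns, for illustration only.
returns = [0.02, -0.02, 0.04, 0.0]
print(semi_variance(returns), sortino_ratio(returns, mar=0.0))
```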
The choice of the level L can be made according to different criteria:


• In order to control the loss risk, we can take L = 0.


• If we consider the riskless rate $R_f$ as a benchmark, we can consider the
value $L = R_f$.


• If we want to compare the performance of funds with each other, the
level L can be chosen equal to the mean of return expectations of these
funds.


As seen in what follows, (unfortunately) this choice is crucial to rank the 
funds. 


11.2.3.3 The Omega performance measure 


This performance measure was introduced by Keating and Shadwick [326]. 
Contrary to standard performance measures, such as the Treynor and Sharpe 
ratios or the Jensen alpha, it takes account of the whole probability distribu- 
tion. 

The Omega measure considers both the gain and loss probabilities. It is
defined by:

$$\Omega_F(L) = \frac{\int_L^b (1 - F(x)) \, dx}{\int_a^L F(x) \, dx}. \tag{11.11}$$

The function F(.) is the cumulative distribution function of the financial
asset return, with range (a,b), w.r.t. the probability distribution P, and L is the
reference level chosen by the investor.

For a given level, the investor would always prefer the portfolio with the
highest Omega value.

The Omega ratio can be written as:

$$\Omega_{F_X}(L) = \frac{E_P\left[ (X - L)^+ \right]}{E_P\left[ (L - X)^+ \right]}. \tag{11.12}$$


REMARK 11.1 Therefore, the Omega function $\Omega_{F_X}(L)$ is the ratio of
the expectation of gains above the level L to the expectation of losses below
the level L. As mentioned in Kazemi et al. [325], the Omega ratio can be
viewed as the ratio of a call to a put having the same strike L written on
the same underlying asset (the portfolio value or its return) but with values
computed w.r.t. the historical probability P instead of a risk-neutral one.
The Omega function also satisfies the following properties:


• For $L = E_P[X]$, $\Omega_{F_X}(L) = 1$;


• $\Omega_{F_X}(.)$ is a decreasing function.


Thus, from the previous properties, we deduce that:


- For levels L smaller than the mean, the Omega ratio is higher than 1.


- For levels L higher than the mean, the Omega ratio is smaller than 1.


• $\Omega_F(.) = \Omega_G(.)$ if and only if F = G.


When portfolio returns always have identical Omega ratios, their
probability distributions are equal.


• The Omega ratio is compatible with the second-order stochastic
dominance:

$$X \succeq_2 Y \Longrightarrow \Omega_{F_X}(L) \geq \Omega_{F_Y}(L). \tag{11.13}$$

As seen in Chapter 1, this is an interesting property.


Kazemi et al. [325] define the Sharpe Omega ratio as follows:

$$\mathrm{Sharpe}_{\Omega}(L) = \frac{E_P[X] - L}{E_P\left[ (L - X)^+ \right]} = \Omega_{F_X}(L) - 1. \tag{11.14}$$

The upper term is the same as for the Sharpe ratio when the level L is
equal to the riskless return. The risk measure $E_P\left[ (L - X)^+ \right]$ replaces
the usual standard deviation.
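
Relations (11.12) and (11.14) translate directly into sample estimators. The following Python sketch is only an illustration: sample means replace the expectations under P, and the function names and the return series are hypothetical.

```python
from statistics import fmean


def omega_ratio(returns, level):
    """Sample version of (11.12): expected gains over expected losses around `level`."""
    gains = fmean([max(r - level, 0.0) for r in returns])
    losses = fmean([max(level - r, 0.0) for r in returns])
    return gains / losses  # undefined when no return falls below `level`


def sharpe_omega(returns, level):
    """Sample version of (11.14): excess mean return over expected losses."""
    losses = fmean([max(level - r, 0.0) for r in returns])
    return (fmean(returns) - level) / losses


# Hypothetical returns, for illustration only.
returns = [0.02, -0.02, 0.04, 0.0]
for level in (0.0, fmean(returns), 0.03):
    print(level, omega_ratio(returns, level), sharpe_omega(returns, level))
```

On any sample, the Omega ratio evaluated at the sample mean equals 1 and the Sharpe Omega ratio equals the Omega ratio minus 1, as stated above.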


Example 11.1 
Consider the following standard case: The payoff X is the value at maturity 
T of a stock S which is assumed to follow a geometric Brownian motion: 


$$X = S_0 \exp\left[(\mu - \sigma^2/2)T + \sigma W_T\right],$$


where $(W_t)_t$ is a standard Brownian motion, so that $W_T$ has a Gaussian
distribution with mean 0 and standard deviation $\sqrt{T}$.


Then, we have: 











$$E_P[X] = S_0 \exp[\mu T].$$


Therefore, the mean return does not depend on the volatility.

Consequently:


- If $S_0 \exp[\mu T] < L$, then the Sharpe Omega ratio is an increasing function
of the volatility (due to the vega of the put).


- If $S_0 \exp[\mu T] > L$, then the Sharpe Omega ratio is a decreasing function
of the volatility.


Generally, the level L is chosen smaller than the mean (since it represents
a loss w.r.t. the expected value). Therefore, for the standard “buy-and-hold”
strategy, the performance measure Omega is indeed decreasing w.r.t. the
usual volatility risk.
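
The monotonicity stated in Example 11.1 can be checked numerically. The sketch below assumes the lognormal payoff of the example and uses the standard closed form for the undiscounted put expectation under P, $E_P[(L - X)^+] = L\,\Phi(-d_2) - S_0 e^{\mu T}\Phi(-d_1)$; the function names and parameter values are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_loss_below(s0, mu, sigma, t, level):
    """E_P[(L - X)^+] for X = s0 * exp((mu - sigma^2/2) t + sigma W_t)."""
    vol = sigma * math.sqrt(t)
    d1 = (math.log(s0 / level) + (mu + 0.5 * sigma ** 2) * t) / vol
    d2 = d1 - vol
    return level * norm_cdf(-d2) - s0 * math.exp(mu * t) * norm_cdf(-d1)


def sharpe_omega_gbm(s0, mu, sigma, t, level):
    """Sharpe Omega ratio (11.14) for the lognormal payoff."""
    mean = s0 * math.exp(mu * t)
    return (mean - level) / expected_loss_below(s0, mu, sigma, t, level)


# Level below the mean (100 < 100 * exp(0.08)): the ratio decreases with sigma.
below_mean = [sharpe_omega_gbm(100.0, 0.08, s, 1.0, 100.0) for s in (0.1, 0.2, 0.4)]
# Level above the mean: the ratio is negative and increases with sigma.
above_mean = [sharpe_omega_gbm(100.0, 0.08, s, 1.0, 120.0) for s in (0.1, 0.2, 0.4)]
```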


In the following numerical example, the Omega performance measure is ap- 
plied to four hedge funds. The time period is 1992-2004. 

Two types of hedge funds are examined: 

1) Hedge/Convertible Arbitrage: A and D. 


2) Hedge/Equity Market Neutral: B and C. 


The estimation of parameters is made on monthly data. 


TABLE 11.4: The four hedge funds characteristics


Characteristics   Fund A   Fund B   Fund C   Fund D

Mean                0.70     0.92     0.35     0.66
Median              0.80     0.80     0.27     0.58
Variance            0.54     0.51    11.74     2.13
Skewness           -0.74     0.65     0.35     0.56
Kurtosis            4.33     3.11     4.56     4.50

Their monthly returns are represented as follows:

FIGURE 11.6: The monthly returns of the four hedge funds


The following figure plots the Omega ratio as a function of the reference level L:

FIGURE 11.7: Omega ratio as function of the threshold level L

For this particular example, we can see that:


- No fund dominates all the others for every value of the threshold L
(fund D, for example, dominates the three other ones for small values of
L but it is dominated by the other ones for high values of L). However,
fund B has an Omega ratio which is always above that of fund A, which
may indicate stochastic dominance of B over A.


- Note that the choice of the reference level L is crucial, in particular around 
“rational” values such as the riskless rate where the ranking changes 
very quickly. Therefore, if the level of L is not an absolute reference, it 
seems necessary to define an additional criterion to determine this level. 
For this purpose, we can take into account the investor’s risk aversion, 
the remuneration of the fund manager, or we can introduce weighted 
Omega ratios (judgment of financial experts, expectation for a given 
probability distribution on the level, etc.). 


11.2.3.4 The ASRAP measure (“Alternative Style Risk Adjusted Performance”)


In order to take into account the management style, Lobosco [360] proposes 
the “style risk adjusted performance” measure (SRAP). 

The risk is measured by the volatility. Since hedge funds generally have
asymmetrical probability distributions with fat tails, it is necessary to examine
at least their moments of orders 3 and 4.


For this purpose, the VaR developed by Cornish and Fisher [131] can be 
introduced to measure the risk. This is an extension of the standard VaR, 
which includes the skewness and the excess kurtosis of the return distribution. 

The first step consists of calculating the VaR based on a Gaussian distri- 
bution then considering the extended VaR of Cornish-Fisher. 


Define q by:


$$q = q_c + \frac{1}{6}(q_c^2 - 1)s + \frac{1}{24}(q_c^3 - 3q_c)k - \frac{1}{36}(2q_c^3 - 5q_c)s^2, \qquad (11.15)$$


where $q_c$ is the critical value at the probability level $(1 - \alpha)$, s is the
skewness, and k is the excess kurtosis.


Then the adjusted VaR is given by:

$$VaR_{CF} = -(\mu - q\sigma), \qquad (11.16)$$


where $\mu$ and $\sigma$ are respectively the mean and the standard deviation of the
probability distribution. Note that, if this distribution is Gaussian, then s = 0
and k = 0, so $q = q_c$. In that case, $VaR_{CF}$ is equal to the usual VaR.

We deduce that the ARAP measure is defined by:


$$ARAP = \frac{VaR_{CF}(I_{FOF})}{VaR_{CF}(HF)} (R_{HF} - R_f) + R_f, \qquad (11.17)$$

where $I_{FOF}$ is a fund of funds (FOF) index, HF denotes the hedge fund to be
examined, $R_{HF}$ is its return, and $R_f$ is the riskless return.


The SRAP measure is determined by the difference between the RAP measure
of the portfolio and the RAP measure of the style benchmark which
corresponds to the portfolio style. By analogy, the ASRAP measure is:


$$ASRAP = ARAP(\text{fund}) - ARAP(\text{style index}). \qquad (11.18)$$
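
Relations (11.15) to (11.18) can be chained in a few lines. The sketch below implements them as written above, with $q_c$ taken as the Gaussian quantile at level $1 - \alpha$; sign conventions for the Cornish-Fisher quantile differ across references, so this convention should be checked before any use on real data. All function names and numerical inputs are hypothetical.

```python
from statistics import NormalDist


def cornish_fisher_quantile(q_c, skew, excess_kurt):
    """Adjusted quantile q of relation (11.15)."""
    return (q_c
            + (q_c ** 2 - 1.0) * skew / 6.0
            + (q_c ** 3 - 3.0 * q_c) * excess_kurt / 24.0
            - (2.0 * q_c ** 3 - 5.0 * q_c) * skew ** 2 / 36.0)


def var_cf(mean, std, skew, excess_kurt, alpha=0.05):
    """Cornish-Fisher VaR of relation (11.16): -(mu - q * sigma)."""
    q_c = NormalDist().inv_cdf(1.0 - alpha)  # Gaussian critical value
    q = cornish_fisher_quantile(q_c, skew, excess_kurt)
    return -(mean - q * std)


def arap(var_index, var_fund, fund_return, riskless_return):
    """ARAP of relation (11.17)."""
    return var_index / var_fund * (fund_return - riskless_return) + riskless_return


def asrap(arap_fund, arap_style_index):
    """ASRAP of relation (11.18)."""
    return arap_fund - arap_style_index


# Hypothetical inputs: a fund twice as risky (in VaR terms) as the FOF index.
example_arap = arap(var_index=2.0, var_fund=4.0, fund_return=0.10, riskless_return=0.02)
```

In the Gaussian case (zero skewness and zero excess kurtosis), `var_cf` reduces to the usual VaR, $q_c\sigma - \mu$.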


11.2.4 Benchmarks for alternative investment 


Alternative investment seeks absolute performance. However,
the use of the riskless return as a benchmark is not always a convenient tool
for all styles of hedge funds, in particular when their beta is nonzero, and if
the implicit assumptions which validate the CAPM are not satisfied.


Nowadays, hedge fund performance is measured more and more relative to 
a style benchmark. 
Such a benchmark must satisfy the following properties: 


- Transparency - The list of hedge funds that are included in the benchmark 
must be detailed and the way to compute their performance must be 
specified. 


- Representativeness - A large set of hedge funds must be considered while 
excluding those which are too small or the management of which is too 
hazardous. 


- Weighting - The use of the respective capitalizations of different funds is not 
easy to handle. The recent development of many funds and their lack 
of standardization does not facilitate this weighting. Therefore, equal 
weights are often considered. 


- Accessibility of the hedge funds.


- The “reporting” frequency.


To satisfy some of the previous required properties, we can:


• Analyze the return of a hedge fund w.r.t. the return of a given portfolio
having the same strategy, but in a passive way (passive benchmark).
Agarwal and Naik ([9], [10]) introduce multifactorial models in order to
analyze the types of assets and strategies used by the hedge funds.
These factors include the strategies based on options on observable
assets, as suggested in Agarwal and Naik [9]. This approach is further
studied in Schneeweis and Spurgin [456]. They show that the performance
measure of hedge funds can actually be based on option strategies.


• Compare the return of a given fund with that of a representative index
(active benchmark).


• Consider a pure style index. This type of index is defined as the true
index which is not observable.


11.2.5 Measure of the performance persistence 


Among studies about performance persistence of hedge funds, we have: 


- Brown et al. [93] who examine if the future returns can be predicted from 
past observations. Their conclusion is that this is not the case. 


- Agarwal and Naik ([7], [8]) conclude that there exists a short-term
performance persistence. They examine annual returns during the time period 1982-1998.
Their results show that the performance persistence level is significant
for monthly observations, but it is reduced for annual observations.


- Park and Staum [402] show that there exists a performance persistence for
CTAs.


- Elton et al. [198] and Brown et al. [93] argue that the survival bias may 
induce performance persistence. 


- Harri and Brorsen [282] show that some funds do perform well over
a sufficiently long time period.


- Baquero et al. [41] take account of the “look-ahead bias” which is induced 
by the multiperiodic sampling. This bias can induce a difference of 
about 3.8%. However they confirm the persistence. 


The general conclusion is that there exists a performance persistence, for
both the “losers” and the “winners.” Note also that the performance is linked
to the management fees received by the fund managers, as noted for example by
Caglayan and Edwards [100].


Usual statistical methods are based on: 


• Regression of the returns; in particular, the behavior of the alpha
coefficient is examined.


• Style analysis.


• Correlation tests, such as Spearman’s test.
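
As an illustration of the last method, the following sketch computes Spearman’s rank correlation between the returns of a set of funds over two consecutive periods; a value close to 1 would suggest that the ranking of the funds persists. Ties receive average ranks. The function names and the data are hypothetical, and a complete test would also require the distribution of the statistic under the null hypothesis.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical returns of five funds over two consecutive periods.
period_1 = [0.04, 0.01, -0.02, 0.07, 0.03]
period_2 = [0.03, 0.02, -0.01, 0.05, 0.00]
rho = spearman_rho(period_1, period_2)
```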


11.3 Optimal allocation in hedge funds


The management of a fund which itself invests in other funds, such as hedge
funds, allows for diversification. This is due to the weak correlation of hedge
funds with standard financial assets, as shown in the next figure, where a hedge
fund index is compared with standard financial indexes.

FIGURE 11.8: Correlation of hedge funds/standard funds


We can see that returns of hedge funds can evolve independently from 
traditional assets (confirmed by the correlation values). 

What is the percentage of hedge funds to include in a fund or a portfolio? 

Obviously, the answer is not easy: 


• Some funds cannot include more than a given percentage (for example,
10%). Otherwise, their category is changed.


• A minimal diversification is desirable to profit from weak beta and high
alpha.


Note that nowadays most financial institutions include hedge funds in their
portfolios. From a theoretical point of view, as a first step, a mean-variance
analysis can be developed as in Schneeweis and Spurgin [456], who determine the
mean-variance frontier, including the S&P500, the Lehman Brothers fixed-income
index, and a hedge fund index, the EACM 100. They conclude that hedge funds must
be used. This analysis is further extended by Cvitanic et al. [141], who introduce
a model based on a dynamic mean-variance criterion. In this framework, they
prove that the model risk reduces the percentage invested in hedge funds.
However, as mentioned by Lhabitant [355], the mean-variance criterion does
not take account of moments of order higher than 2, which strongly limits
the analysis. This is why, for example, Chabaane et al. [111] introduce
various optimization criteria such as the expected shortfall.


11.4 Further reading 


Lhabitant [354] provides an overview of hedge funds including description, 
classification, and performance. Fung and Hsieh [248] examine the hedge 
fund survival lifetimes, while in [249] they show how hedge fund strategies 
may involve risk. Brooks and Kat [91] examine statistical properties of hedge 
fund index returns and deduce their implications for portfolio management.
The book of Gregoriou et al. [268] contains a survey about performance
persistence. Bookstaber and Roger [83] highlight the problem of measuring 
performance of portfolios with options. Bacmann and Scholz [36] propose 
alternative performance measures for hedge funds. Bonnet and Nagot [81] 
propose a class of performance measures in order to evaluate alternative
investment, regardless of assumptions on payoff. The representation of these
measures involves the Log-Laplace transform of the asset distribution. Among
them are the squared Sharpe ratio, Stutzer’s rank ordering index, and
Hodges’ Generalized Sharpe ratio. Cascon et al. [107] provide some
mathematical properties of the Omega measure.

Avouyi et al. [27] apply the Omega to the portfolio allocation choice prob- 
lem, using Threshold Accepting. They introduce conditional copula to take 
into account non-Gaussian returns, extreme joint movements, time-varying 
dependence, and volatility. They apply these methods to a portfolio com- 
posed of three total stock market indices (US, UK and Germany). Amenc 
and Martellini [23] also examine portfolio optimization involving hedge funds 
using an improved estimator of the covariance structure of hedge fund index 
returns. Using data from CSFB-Tremont hedge fund indices, they conclude 
that ex-post volatility of minimum variance portfolios is between 1.5 and 6 
times lower than that of a value-weighted benchmark. Thus, inclusion of 
hedge funds in a portfolio can potentially generate a dramatic decrease in 
the portfolio volatility without lower expected returns. Krokhmal et al. [337] 
also endeavor to optimize a portfolio of hedge funds. They examine linear
rebalancing strategies using Value-at-Risk and CVaR criteria. 


Appendix A: ARCH Models


Linear and non linear processes 


Time series have been introduced in particular to describe and to predict 
discrete-time dynamics. One of the most popular models is the so-called au- 
toregressive moving average process (ARMA). The current value of the series 
is a linear function of its own lagged values and of current and past values 
of a “noise” process, usually called the innovation process. However, this
model is set up in a linear framework (a rather strong approximation), and no
constraint is usually imposed on the moving average parameters (which does not
allow structural relations to be taken into account). Financial time series,
for example, exhibit nonlinear dynamics and are subject to structural
constraints such as equilibrium conditions.

Therefore, a new family of time series models has been introduced by Engle [204]:
the ARCH (Autoregressive Conditionally Heteroscedastic) models. These models
allow consideration of nonlinear time series. They can also be
applied to path-dependent volatility models.


Weak and strong stationarity 
A time series is stationary if it has no trend and no seasonality: it is
homogeneous with respect to time. More precisely:


DEFINITION A.1 A time series $(X_t)_t$ is weakly stationary if:


$$E(X_1) = \ldots = E(X_t) = \ldots = \mu,$$
$$Cov(X_t, X_{t+h}) = \gamma(h), \quad \forall (t,h) \in \mathbb{N} \times \mathbb{Z}. \qquad (A.1)$$











In particular, the autocovariances only depend on the lag h and not
on the current time t.

A time series $(X_t)_t$ is strongly stationary if, for any n and any $t_1 < \ldots < t_n$,
the probability distribution of the random vector $(X_{t+t_1}, \ldots, X_{t+t_n})$ does
not depend on t. For second-order processes, strong stationarity obviously
implies weak stationarity.

Example A.1
A well-known non-stationary time series is the random walk (RW):


$$X_t = X_{t-1} + \varepsilon_t,$$


where $(\varepsilon_t)_t$ is a standard Gaussian white noise (WN) (i.e. the random
variables $\varepsilon_t$ are independent and have the same probability distribution,
which is the standard Gaussian law $\mathcal{N}(0, 1)$).

Such series “diverge.” By contrast, consider a stationary time series which
is a first-order autoregressive process (AR1):


$$Y_t = \phi Y_{t-1} + \varepsilon_t, \quad \text{with } |\phi| < 1.$$


The behaviors of the two time series are illustrated in the next figure (for
$X_0 = Y_0 = 0$).


FIGURE A.1: Random walk (RW) and autoregressive process of order 1
(AR1)





The stationary series varies around its mean (equal to 0), whereas the
variance of the random walk grows with time, since we have:


$$X_N = X_0 + \sum_{t=1}^{N} \varepsilon_t \Longrightarrow \sigma^2(X_N) = \sigma^2\left(\sum_{t=1}^{N} \varepsilon_t\right) = N.$$


Note that the latter equality proves that the random walk is not stationary.
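
The contrast between the two processes can be checked by simulation. The sketch below draws many independent paths, both started at 0 and driven by the same standard Gaussian noise, and compares the cross-sectional variance at the final date: it should be close to N for the random walk, and close to the stationary value $1/(1 - \phi^2)$ for the AR(1) process. The function names, the seed, and the sample sizes are arbitrary choices.

```python
import random


def simulate_terminal_values(n_steps, n_paths, phi, seed=0):
    """Terminal values of a random walk and of an AR(1), both started at 0."""
    rng = random.Random(seed)
    rw_end, ar_end = [], []
    for _ in range(n_paths):
        x = 0.0  # random walk
        y = 0.0  # AR(1)
        for _ in range(n_steps):
            eps = rng.gauss(0.0, 1.0)  # standard Gaussian white noise
            x = x + eps
            y = phi * y + eps
        rw_end.append(x)
        ar_end.append(y)
    return rw_end, ar_end


def sample_variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


rw_end, ar_end = simulate_terminal_values(n_steps=100, n_paths=2000, phi=0.5)
rw_variance = sample_variance(rw_end)  # theoretical value: N = 100
ar_variance = sample_variance(ar_end)  # theoretical value: 1 / (1 - 0.25)
```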


Standard econometric analysis is based on stationarity and is not adapted
to non-stationary time series, as shown by the following example of Granger
and Newbold [264].

Example A.2 Fallacious regressions

Consider a linear regression between two independent random walks RW1
and RW2 such that (the numbers in parentheses are the Student’s
t-statistics):


$$RW1_t = \underset{(7)}{3.5167} + \underset{(72.25)}{0.59}\, RW2_t + \varepsilon_t, \quad \text{with } R^2 = 0.72.$$


The variable RW2 would thus appear to explain the dynamics of RW1: an
apparent linear relation exists between these two random variables,
whereas they are independent. This surprising result is due to their common
propensity for diverging. However, if we consider two independent Gaussian
white noises $\varepsilon^{(1)}$ and $\varepsilon^{(2)}$ (thus stationary processes), no linear
relation between them appears:


$$\varepsilon^{(1)}_t = \underset{(-0.66)}{-0.0148} - \underset{(-0.841)}{0.086}\, \varepsilon^{(2)}_t + e_t, \quad \text{with } R^2 = 0.018.$$


None of the previous coefficients is significant.
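
This phenomenon is easy to reproduce. The following sketch, which does not use the data of the example, regresses one simulated series on another (both independent by construction) and averages the $R^2$ over many replications, once for random walks and once for white noises. For a simple regression with intercept, $R^2$ is the squared sample correlation. All names and sample sizes are illustrative.

```python
import random


def r_squared(x, y):
    """R^2 of the OLS regression of y on x with intercept (squared correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def random_walk(noise):
    """Cumulative sums of a noise sequence."""
    path, level = [], 0.0
    for e in noise:
        level += e
        path.append(level)
    return path


def mean_r_squared(n_obs, n_rep, integrate, seed=0):
    """Average R^2 between two independent series over n_rep replications."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        e1 = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        e2 = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        if integrate:  # turn the white noises into random walks
            e1, e2 = random_walk(e1), random_walk(e2)
        total += r_squared(e1, e2)
    return total / n_rep


r2_random_walks = mean_r_squared(n_obs=200, n_rep=200, integrate=True)
r2_white_noises = mean_r_squared(n_obs=200, n_rep=200, integrate=False)
```

The average $R^2$ should be sizable for the random walks and close to zero for the white noises, although the two series are independent in both cases.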


ARMA processes 


The current value of $X_t$ is a linear function of the past values of X and of
past and current values of a white noise process $\varepsilon$. Define:


$$\underline{X}_{t-1} = (X_1, \ldots, X_{t-1}).$$


Denote also by $LE\left[X_t | \underline{X}_{t-1}\right]$ the linear regression of $X_t$ on $\underline{X}_{t-1}$
(the best prediction of $X_t$ by means of a linear function of $X_1, \ldots, X_{t-1}$).





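DEFINITION A.2 A second order stochastic process is:
1) an autoregressive process of order K if and only if:


$$E\left[X_t | \underline{X}_{t-1}\right] = E\left[X_t | X_{t-1}, \ldots, X_{t-K}\right], \quad \forall t.$$


2) a linear autoregressive process of order K if and only if:


$$LE\left[X_t | \underline{X}_{t-1}\right] = LE\left[X_t | X_{t-1}, \ldots, X_{t-K}\right], \quad \forall t.$$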




















An ARMA representation is defined as follows:

$$X_t = C + \Phi_1 X_{t-1} + \ldots + \Phi_p X_{t-p} + \varepsilon_t - \Theta_1 \varepsilon_{t-1} - \ldots - \Theta_q \varepsilon_{t-q}, \qquad (A.2)$$


where $\Phi_1, \ldots, \Phi_p$ and $\Theta_1, \ldots, \Theta_q$ are square matrices and C is a vector.
The autoregressive and moving average lag-polynomials are given by:


$$\Phi(L) = Id - \Phi_1 L - \ldots - \Phi_p L^p,$$
$$\Theta(L) = Id - \Theta_1 L - \ldots - \Theta_q L^q, \qquad (A.3)$$

where L denotes the lag operator, $L(X_t) = X_{t-1}$. The ARMA representation
is then defined by:


$$\Phi(L)(X_t) = C + \Theta(L)\varepsilon_t. \qquad (A.4)$$


The coefficients $\Phi_1, \ldots, \Phi_p$ and $\Theta_1, \ldots, \Theta_q$ are usually constrained. For
example, the roots of the equations:


$$\det(\Phi(z)) = 0 \quad \text{and} \quad \det(\Theta(z)) = 0$$

are supposed to be outside the unit circle ($|z| > 1$). Indeed, under this latter
assumption, the lag-polynomials $\Phi(L)$ and $\Theta(L)$ can be inverted. This
leads to alternative representations of the process X:


(infinite moving average representation) $X_t = \Phi(L)^{-1} C + \Phi(L)^{-1}\Theta(L)\varepsilon_t$,
(infinite autoregressive representation) $\Theta(L)^{-1}\Phi(L)X_t = \Theta(1)^{-1} C + \varepsilon_t$.

Thus ARMA processes are relatively tractable. Note that they can be
considered as truncated approximations of weakly stationary processes due to
Wold’s decomposition.
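
As a worked illustration of the infinite moving average representation, consider the scalar ARMA(1,1) case $X_t = \phi X_{t-1} + \varepsilon_t - \theta\varepsilon_{t-1}$ with $|\phi| < 1$. The coefficients of $\Phi(L)^{-1}\Theta(L)$ are then $\psi_0 = 1$ and $\psi_j = \phi^{j-1}(\phi - \theta)$ for $j \geq 1$. The sketch below recovers them as the response of the recursion to a unit shock; the function names and parameter values are illustrative.

```python
def arma11_impulse_response(phi, theta, n_terms):
    """psi_j obtained by feeding a unit shock at date 0 through the recursion."""
    psi = []
    x_prev, eps_prev = 0.0, 0.0
    for j in range(n_terms):
        eps = 1.0 if j == 0 else 0.0
        x = phi * x_prev + eps - theta * eps_prev
        psi.append(x)
        x_prev, eps_prev = x, eps
    return psi


def arma11_psi_closed_form(phi, theta, n_terms):
    """Closed-form coefficients of Phi(L)^{-1} Theta(L) in the scalar case."""
    return [1.0 if j == 0 else phi ** (j - 1) * (phi - theta)
            for j in range(n_terms)]


numerical = arma11_impulse_response(phi=0.5, theta=0.3, n_terms=10)
closed_form = arma11_psi_closed_form(phi=0.5, theta=0.3, n_terms=10)
```

The coefficients decay geometrically at rate $\phi$, which is why a truncated moving average is a reasonable approximation when the root of $\Phi(z)$ lies well outside the unit circle.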


THEOREM A.1 Wold’s theorem 


Consider a weakly stationary process $(X_t)_t$ such that


$$\lim_{k \to \infty} LE\left[X_{t+k} | \underline{X}_t\right] = E\left[X_t\right],$$


which means that no information at an infinite horizon is given from the
observations prior to time t.
Then, this process always has an infinite moving average representation:


$$X_t = C_0 + A(L)\varepsilon_t, \qquad (A.5)$$

where $(\varepsilon_t)_t$ is a sequence of homoscedastic noise variables, $V(\varepsilon_t) = \Omega$, which
are uncorrelated, $Cov(\varepsilon_t, \varepsilon_{t'}) = 0$, $\forall t \neq t'$, with zero mean, $E[\varepsilon_t] = 0$, and such
that the coefficients $A_1, \ldots, A_j, \ldots$ satisfy the following stability condition:


$$\sum_{j=0}^{\infty} \|A_j\|^2 < \infty.$$


Note that the process $\varepsilon$ satisfies:


$$\varepsilon_t = X_t - LE\left[X_t | \underline{X}_{t-1}\right],$$


which is the reason why this process is called the innovation process.

ARCH model


The Autoregressive Conditionally Heteroscedastic model takes account of
time-dependent conditional variances. For example:


$$X_t = \varepsilon_{t-1}^2 \varepsilon_t,$$


where $\varepsilon$ is a Gaussian white noise with variance $\sigma^2$. The process X is weakly
stationary and the conditional variance depends on lagged residuals:


$$Var\left(X_t | \underline{X}_{t-1}\right) = Var\left(\varepsilon_{t-1}^2 \varepsilon_t | \underline{X}_{t-1}\right) = \sigma^2 \varepsilon_{t-1}^4.$$


ARCH volatility models 
The univariate case 


A standard ARCH volatility process of order p can be written as:


$$R_t = \sum_{n=1}^{k} \phi_n R_{t-n} + \varepsilon_t,$$
$$\sigma_t^2 = a_0 + \sum_{i=1}^{p} a_i \varepsilon_{t-i}^2, \qquad (A.6)$$


where $\sigma_t^2$ is the conditional variance of the innovation process $\varepsilon_t$. The previous
representation is based on a moving average of the squares of the innovations.
The coefficients $a_0 > 0$, $a_i \geq 0$, $i = 1, \ldots, p$, are assumed to be non-negative.
Under the assumption $\sum_{i=1}^{p} a_i < 1$, the time series is stationary.

This representation has been extended by Bollerslev et al. [80], who
introduce the GARCH(p, q) model for which the conditional variance satisfies the
following equation:

$$\sigma_t^2 = a_0 + \sum_{i=1}^{p} a_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2. \qquad (A.7)$$


In that case, the variance is defined as the sum of an autoregressive term
and of a moving average of the squares of the innovations. The conditional
variance is stationary if the condition $\sum_{i=1}^{p} a_i + \sum_{j=1}^{q} \beta_j < 1$ is satisfied.
More generally, the term $\sum_{i=1}^{p} a_i + \sum_{j=1}^{q} \beta_j$ indicates the persistence degree
of the variance, since the non-conditional variance can be written as:


$$\sigma^2 = \frac{a_0}{1 - \sum_{i=1}^{p} a_i - \sum_{j=1}^{q} \beta_j}.$$

When $\sum_{i=1}^{p} a_i + \sum_{j=1}^{q} \beta_j = 1$, the non-conditional variance is infinite. The
conditional variance follows an IGARCH process (Integrated GARCH).

Note that if the standardized innovations ($z_t = \varepsilon_t / \sigma_t$) have a standard
Gaussian distribution, then the marginal distribution of the innovation process $\varepsilon$
is characterized by a kurtosis which is always higher than the standard
Gaussian one. In order to fit fat tail distributions, other innovation processes can
be introduced, for example GED or Student distributions.
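
Both statements (the non-conditional variance formula and the excess kurtosis of the marginal distribution) can be checked on a simulated GARCH(1,1) path with Gaussian standardized innovations. In the sketch below, the parameter values $a_0 = 0.1$, $a_1 = 0.1$, $\beta_1 = 0.8$ are arbitrary illustrative choices giving a non-conditional variance equal to 1; the function names are hypothetical.

```python
import math
import random


def unconditional_variance(a0, a_coeffs, b_coeffs):
    """a0 / (1 - sum a_i - sum beta_j), defined when the persistence is below 1."""
    persistence = sum(a_coeffs) + sum(b_coeffs)
    if persistence >= 1.0:
        raise ValueError("persistence >= 1: the non-conditional variance is infinite")
    return a0 / (1.0 - persistence)


def simulate_garch11(a0, a1, b1, n, seed=0):
    """Innovations eps_t = sigma_t * z_t with z_t standard Gaussian."""
    rng = random.Random(seed)
    var = unconditional_variance(a0, [a1], [b1])  # start at the long-run level
    path = []
    for _ in range(n):
        eps = math.sqrt(var) * rng.gauss(0.0, 1.0)
        path.append(eps)
        var = a0 + a1 * eps ** 2 + b1 * var  # relation (A.7) with p = q = 1
    return path


def sample_moments(values):
    """Sample variance and kurtosis (fourth moment over squared variance)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m2, m4 / m2 ** 2


path = simulate_garch11(a0=0.1, a1=0.1, b1=0.8, n=200000)
variance, kurtosis = sample_moments(path)
```

The sample variance should be close to 1, and the sample kurtosis should exceed the Gaussian value 3 even though each $z_t$ is Gaussian.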

In order to take asymmetrical distributions into account, several models
have been introduced by Nelson [398], Engle and Ng [208], Glosten et al.
[256], and Zakoian [509], etc.


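• GJR model (Glosten, Jagannathan, and Runkle [256]), defined by:

$$\sigma_t^2 = a_0 + a\varepsilon_{t-1}^2 + \gamma 1_{\{\varepsilon_{t-1} < 0\}}\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2,$$

with $a_0 > 0$, $a \geq 0$, $a + \gamma \geq 0$ and $\beta \geq 0$. The process is stationary if
$\beta + a + \gamma/2 < 1$.

• TGARCH model (Threshold GARCH), Zakoian [509], defined by:

$$\sigma_t = a_0 + a\left|\varepsilon_{t-1}\right| + \gamma\left|\varepsilon_{t-1}\right| 1_{\{\varepsilon_{t-1} < 0\}} + \beta\sigma_{t-1},$$

with $a_0 > 0$, $a \geq 0$, $a + \gamma \geq 0$ and $\beta \geq 0$. The volatility process is
stationary if:

$$\beta^2 + \frac{a^2 + (a + \gamma)^2}{2} + 2\beta\frac{2a + \gamma}{\sqrt{2\pi}} < 1.$$

• EGARCH model (Exponential GARCH) (Nelson [398]), defined by:

$$\ln\sigma_t^2 = a_0 + a\left(\left|z_{t-1}\right| - E\left|z_{t-1}\right|\right) + \gamma z_{t-1} + \beta\ln\sigma_{t-1}^2,$$

where $E\left|z_{t-1}\right| = \sqrt{2/\pi}$ under the normality assumption.
The volatility process is stationary if $\beta < 1$.

• If the long-term volatility is non-constant, we can consider
“component models.” These models allow us to take into account
mean-reverting properties:

$$\sigma_t^2 - q_t = a\left(\varepsilon_{t-1}^2 - q_{t-1}\right) + \beta\left(\sigma_{t-1}^2 - q_{t-1}\right), \qquad (A.8)$$
$$q_t = \omega + \rho\left(q_{t-1} - \omega\right) + \phi\left(\varepsilon_{t-1}^2 - \sigma_{t-1}^2\right). \qquad (A.9)$$

The term $q_t$ is the long-term volatility component. Equation (A.8)
describes the transient volatility component, $\sigma_t^2 - q_t$, which converges
to zero with speed $(a + \beta)$.

Equation (A.9) defines the long-term volatility component $q_t$, which
converges to $\omega$ with speed $\rho$.

• To take account of asymmetry, the TARCH model can be used:

$$\sigma_t^2 - q_t = a\left(\varepsilon_{t-1}^2 - q_{t-1}\right) + \gamma\left(\varepsilon_{t-1}^2 - q_{t-1}\right) 1_{\{\varepsilon_{t-1} < 0\}} + \beta\left(\sigma_{t-1}^2 - q_{t-1}\right). \qquad (A.10)$$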


In all previous relations, $\beta$ is the autoregressive term, a is the effect of a
shock on return, and $\gamma$ is the asymmetry effect corresponding to an additional
impact of a negative shock. Thus, for the GJR and TGARCH models, the
effect of a positive shock is measured by a, and the effect of a negative shock
is measured by $a + \gamma$ ($\gamma$ is assumed to be positive).

For the EGARCH model, the effect of a positive shock is measured by $a + \gamma$
and the effect of a negative shock is measured by $a - \gamma$. In that case, $\gamma$ must
be negative so that a negative shock has a higher impact than a positive
shock.


REMARK A.1 The comparison of these asymmetrical models can be 
based on different response curves to innovations which illustrate the inno- 
vation effects on the conditional variance (News Impact Curves) proposed by 
Engle and Ng [208]. 
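
A minimal sketch of such response curves is given below for the GJR and EGARCH specifications, holding the lagged variance fixed and varying only the last shock. The parameter values are arbitrary and the function names are hypothetical; the indicator is taken equal to 1 for negative shocks, as in the discussion above.

```python
import math


def gjr_next_variance(shock, prev_var, a0, a, gamma, beta):
    """GJR conditional variance after a shock eps_{t-1} = shock."""
    indicator = 1.0 if shock < 0.0 else 0.0
    return a0 + (a + gamma * indicator) * shock ** 2 + beta * prev_var


def egarch_next_variance(z, prev_var, a0, a, gamma, beta):
    """EGARCH conditional variance after a standardized shock z_{t-1} = z."""
    mean_abs_z = math.sqrt(2.0 / math.pi)  # E|z| under normality
    log_var = a0 + a * (abs(z) - mean_abs_z) + gamma * z + beta * math.log(prev_var)
    return math.exp(log_var)


# News impact curves on a grid of shocks, for a lagged variance equal to 1.
shocks = [x / 2.0 for x in range(-6, 7)]
gjr_curve = [gjr_next_variance(s, 1.0, 0.1, 0.05, 0.1, 0.8) for s in shocks]
egarch_curve = [egarch_next_variance(s, 1.0, 0.0, 0.1, -0.1, 0.9) for s in shocks]
```

With $\gamma > 0$ in the GJR model and $\gamma < 0$ in the EGARCH model, both curves are higher for a negative shock than for a positive shock of the same size.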


The multidimensional case 


Consider an N-dimensional GARCH model:

$$Y_t = \mu_t + \varepsilon_t, \quad \text{with } \varepsilon_t | I_{t-1} \sim \mathcal{N}(0, H_t).$$

Two basic examples of such models are:


- Diagonal VECH:


$$H_t = A_0 + \sum_{i=1}^{p} A_i \odot H_{t-i} + \sum_{i=1}^{q} B_i \odot \left(\varepsilon_{t-i}\varepsilon_{t-i}'\right),$$

where $\odot$ denotes the element-by-element (Hadamard) product.


- BEKK (see [206]):


$$H_t = A_0 A_0' + \sum_{i=1}^{p} A_i H_{t-i} A_i' + \sum_{i=1}^{q} B_i \left(\varepsilon_{t-i}\varepsilon_{t-i}'\right) B_i'.$$


Consider for example the multidimensional GARCH model which is called the
DCC-MVGARCH (Dynamic Conditional Correlations Multivariate Generalized
Auto Regressive Conditional Heteroscedastic model) introduced by Engle
and Sheppard [209]:


$$\varepsilon_t | I_{t-1} \sim \mathcal{N}(0, H_t),$$
$$H_t = D_t R_t D_t,$$

where $\varepsilon_t$ are the residuals from the filtration, $D_t$ is a diagonal matrix whose
coefficients are the stochastic standard deviations generated by univariate
GARCH processes, and $R_t$ denotes a stochastic correlation matrix.


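The log-likelihood is defined by:


$$\begin{aligned}
L &= -\frac{1}{2}\sum_{t=1}^{T}\left(k\log(2\pi) + \log(|H_t|) + \varepsilon_t' H_t^{-1}\varepsilon_t\right) \\
  &= -\frac{1}{2}\sum_{t=1}^{T}\left(k\log(2\pi) + \log(|D_t R_t D_t|) + \varepsilon_t' D_t^{-1} R_t^{-1} D_t^{-1}\varepsilon_t\right) \\
  &= -\frac{1}{2}\sum_{t=1}^{T}\left(k\log(2\pi) + 2\log(|D_t|) + \log(|R_t|) + \eta_t' R_t^{-1}\eta_t\right),
\end{aligned}$$


where $\eta_t = D_t^{-1}\varepsilon_t \sim \mathcal{N}(0, R_t)$ are the residuals, standardized by their conditional
standard deviations. In Engle and Sheppard [209], the squared coefficients of $D_t$ are
univariate GARCH conditional variances given by:


$$h_{it} = \omega_i + \sum_{p=1}^{P_i}\alpha_{ip}\varepsilon_{i,t-p}^2 + \sum_{q=1}^{Q_i}\beta_{iq}h_{i,t-q},$$


where $h_{it}$ is the usual conditional GARCH variance and, for $i = 1, 2, \ldots, k$,
usual non-negativity constraints hold together with the stationarity condition:


$$\sum_{p=1}^{P_i}\alpha_{ip} + \sum_{q=1}^{Q_i}\beta_{iq} < 1.$$


The dynamic correlations are:


$$Q_t = \left(1 - \sum_{m=1}^{M}\alpha_m - \sum_{n=1}^{N}\beta_n\right)\bar{Q} + \sum_{m=1}^{M}\alpha_m\left(\eta_{t-m}\eta_{t-m}'\right) + \sum_{n=1}^{N}\beta_n Q_{t-n},$$
$$R_t = Q_t^{*-1} Q_t Q_t^{*-1},$$


where $\alpha$ and $\beta$ are the weights, $\bar{Q}$ is the non-conditional covariance of
standardized residuals, $Q_t^*$ is a diagonal matrix whose coefficients are the square
roots of the diagonal coefficients $q_{iit}$ of the matrix $Q_t$, and M and N are the DCC
lags. Note that the coefficients of $R_t$ are given by:


$$\rho_{ijt} = \frac{q_{ijt}}{\sqrt{q_{iit}\, q_{jjt}}}.$$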
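
The two steps of the correlation dynamics (update of $Q_t$, then rescaling into a correlation matrix) can be sketched as follows for one lag of each type (M = N = 1). Matrices are represented as nested lists to keep the example self-contained; the function names and the numerical inputs are hypothetical.

```python
import math


def dcc11_update(q_prev, eta_prev, q_bar, alpha, beta):
    """One step of the Q_t recursion with M = N = 1."""
    n = len(q_bar)
    return [[(1.0 - alpha - beta) * q_bar[i][j]
             + alpha * eta_prev[i] * eta_prev[j]
             + beta * q_prev[i][j]
             for j in range(n)] for i in range(n)]


def correlation_from_q(q):
    """R_t = Q*^{-1} Q Q*^{-1}, i.e. rho_ij = q_ij / sqrt(q_ii q_jj)."""
    n = len(q)
    return [[q[i][j] / math.sqrt(q[i][i] * q[j][j]) for j in range(n)]
            for i in range(n)]


q_bar = [[1.0, 0.3], [0.3, 1.0]]  # non-conditional covariance (illustrative)
eta_prev = [1.5, -0.5]            # last standardized residuals (illustrative)
q_t = dcc11_update(q_bar, eta_prev, q_bar, alpha=0.05, beta=0.9)
r_t = correlation_from_q(q_t)
```

The rescaling step guarantees that $R_t$ has a unit diagonal, which $Q_t$ itself does not have in general.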


Appendix B: Stochastic Processes


Stochastic basis, filtration, stopping times 


The notion of filtration is used to represent, for example, the flow of possible 
observations of prices on a financial market which is available for traders and 
portfolio managers. 


DEFINITION B.1 Suppose $(\Omega, \mathcal{F}, P)$ is a probability space.

A filtration $(\mathcal{F}_t)_t$ is an increasing family of sub-sigma-fields $\mathcal{F}_t \subset \mathcal{F}$. A
stochastic basis is a probability space equipped with a filtration. Increasing
means that if $s \leq t$, then $\mathcal{F}_s \subset \mathcal{F}_t$. $\mathcal{F}_t$ is usually interpreted as the set of
events that occur before or at time t. Generally, $\mathcal{F}_t$ represents the history of
some process observed up to time t, but other possible histories are allowed.


It is assumed that the “usual conditions” are satisfied:
i) Complete: every P-null set in $\mathcal{F}$ belongs to $\mathcal{F}_0$ and so to all $\mathcal{F}_t$.


ii) Right continuous: $\mathcal{F}_t = \bigcap_{s>t}\mathcal{F}_s$.


DEFINITION B.2 A continuous time stochastic process X is a family of
random variables $(X_t)_t$ defined on $(\Omega, \mathcal{F}, P)$, indexed by t, which take values in
$(E, \mathcal{E})$. Hence, for all t, $X_t$ is a random variable with values in E. Moreover,
for each fixed $\omega$ (which represents a “state of the world”), $t \mapsto X_t(\omega)$ is a
function defined on $[0,T]$, called a path or a trajectory of the process X.


The concept of martingale is crucial in the modern theory of finance. If
the price process M of an asset is a martingale, the conditional expectation
at time s of the future value $M_t$ of the asset at time t is given by its current
value $M_s$.


DEFINITION B.3 A martingale (resp. submartingale, resp. supermartingale)
is an adapted process X on the basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$ whose paths
are all right continuous and left limited (rcll) P-almost surely, such that every
$X_t$ is integrable and such that for $s \leq t$:


$$X_s = E_P[X_t|\mathcal{F}_s] \quad (\text{resp. } X_s \leq E_P[X_t|\mathcal{F}_s], \text{ resp. } X_s \geq E_P[X_t|\mathcal{F}_s]).$$





























In order to model dynamics of financial assets, several types of stochastic
processes are introduced: Brownian motion, Poisson processes, and, more
generally, Lévy processes, diffusions, diffusions with jumps, point processes,
and martingales (see Shiryaev [469]).


Semimartingales and stochastic integrals 


The class of semimartingales is the class of stochastic processes that is 
“rich” enough and sufficiently “tractable.” It contains the previous processes 
and is stable under many of the usual transformations: localization, change 
of measure, change of filtration, and change of time. Finally, it is possible to 
define stochastic integrals with respect to semimartingales, which leads to the 
famous Stochastic Calculus and its very powerful results, like the Ito formula. 


DEFINITION B.4 1) A semimartingale is a process X which has the
following decomposition (not unique)


$$X = X_0 + A + M,$$


where $X_0$ is finite-valued and $\mathcal{F}_0$-measurable, M is a local martingale with
$M_0 = 0$, and A has finite variation.

2) A special semimartingale is a semimartingale X for which A is moreover
predictable. Furthermore, this decomposition is unique and called the
canonical decomposition of the special semimartingale X.


There exist several ways to construct stochastic integrals (see e.g., Protter 
[419] for details). 
If X has finite variation and if H is a bounded process, the integral


$$H \cdot X_t = \int_0^t H_s\, dX_s$$


is directly defined as the Stieltjes integral path-by-path. But this
construction excludes such fundamental processes as, for instance, the Brownian
motion, whose paths almost surely have infinite variation over each finite
interval. Martingales in general, and also Markov processes, are similarly
excluded. This is a standard problem with semimartingales X having no
finite variation, since the measure $dX_s(\omega)$ is not defined.

So a special construction of a stochastic integral had to be developed while
taking into account that stochastic integrals must be defined as limits of
sums, like usual integrals. For this purpose, restrictions must be made on both
integrand and integrator.


DEFINITION B.5 A process H is said to be simple predictable if H has
a representation


$$H_t = H_0 1_{\{0\}}(t) + \sum_{i=1}^{n} H_i 1_{(T_i, T_{i+1}]}(t), \qquad (B.1)$$


where $0 = T_0 \leq T_1 \leq \ldots \leq T_{n+1} < \infty$ is a finite sequence of stopping times,
$H_i \in \mathcal{F}_{T_i}$, with $|H_i| < \infty$ a.s., and $0 \leq i \leq n$.


DEFINITION B.6 Let X be a right continuous and left limited process.
Define the linear mapping associated to stochastic integrals with respect to
process X by, for any simple predictable process H:


$$J_X(H) = H_0 X_0 + \sum_{i=1}^{n} H_i\left(X^{T_{i+1}} - X^{T_i}\right), \qquad (B.2)$$

whenever

$$H_t = H_0 1_{\{0\}}(t) + \sum_{i=1}^{n} H_i 1_{(T_i, T_{i+1}]}(t),$$


where $0 = T_0 \leq T_1 \leq \ldots \leq T_{n+1} < \infty$ is a finite sequence of stopping times,
$H_i \in \mathcal{F}_{T_i}$, with $|H_i| < \infty$ a.s., and $0 \leq i \leq n$, and where $X^{T}$ denotes the
process X stopped at time T.


$J_X(H)$ is called the stochastic integral of H with respect to X.


Example B.1 

Consider the standard Brownian motion W. Recall that W can be
characterized as a continuous stochastic process with independent and (Gaussian)
stationary increments (i.e. for each $s < t$, $W_t - W_s$ is independent of $\mathcal{F}_s$ and
its distribution is the Gaussian law with mean 0 and variance equal to $t - s$) with
$W_0 = 0$.

Let (Pin)n be a refining sequence (i.e. Pin C Pim if m > n) of partitions of 
(0, co) with mesh sizes converging to 0 as n goes to infinity. Consider 


We = 5 L ¢t,,te4a]- 


tkEPin 
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W,, converges to W. Now fix t and assume that t is a partition point of each 
Pin. Then, 


Jw(Wa)= Wi, (Wie — WW"), 
tkEPin,tk<t 
and 
Jw(W) = lim Jw(Wah= $, Wa Wasi — Wee) 
tkEPin,tk<t 
= Jim) DO 1/2Wi, + Wing1)Wirgı — Win) — 1/2(Wnyi — Wa)? 


tkEPin,tk<t 


=1/2W? -1/2 lim SX Wap Wa) 


tkEPin,tk<t 


Now, examine the last term on the right. Denote 


Yk = (Wear Wa)” (tha tk). 





The sequence (Y;)x is an iid sequence of random variables with zero mean. 
Thus 





























E 5 (Wess — Wir)? — (tk+1 — te) =) EY]. 


th EPin k 


Moreover, (FracWy, ai — Wtstk+1 — th)? has the distribution of the square 
of a Gaussian random variable Z with zero mean and variance equal to 1. 
Therefore 




















E| 5 (Wirya — We)? — (thoi — te)| < E[(Z? — 1)°] mesh(Pin )t, 


tkEPin 











which converges to 0 as $n$ goes to infinity. Thus, the last term on the right converges to $t$ (in $L^2$; almost sure convergence can also be proved, see Protter [419]). Consequently,

$$\int_0^t W_s\, dW_s = \tfrac{1}{2}\, W_t^2 - \tfrac{1}{2}\, t,$$

which distinctly differs from the Riemann-Stieltjes integral formula. For processes $A$ with continuous paths of finite variation, the term $\sum_{t_k \in \pi_n,\, t_k < t} (A_{t_{k+1}} - A_{t_k})^2$ converges to 0, which explains the difference between path-by-path Riemann-Stieltjes integrals with respect to such processes $A$ and stochastic integrals with respect to processes such as the Brownian motion.
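The computation in Example B.1 can be illustrated numerically. The following is a minimal sketch, assuming a uniform partition of $[0, t]$ and Gaussian increments drawn with Python's standard `random` module; the function name `ito_sum_w_dw` and its parameters are illustrative choices, not taken from the text. It accumulates the left-point sums $\sum W_{t_k}(W_{t_{k+1}} - W_{t_k})$ together with the sum of squared increments.

```python
import math
import random


def ito_sum_w_dw(t=1.0, n_steps=20000, seed=0):
    """Left-point Riemann sum of W dW on [0, t] over a uniform partition.

    Returns (W_t, sum of W_{t_k} dW_k, sum of dW_k^2).
    """
    rng = random.Random(seed)
    dt = t / n_steps
    w = 0.0          # W_0 = 0
    ito_sum = 0.0    # approximates the stochastic integral of W against W
    quad_var = 0.0   # approximates the quadratic variation, expected near t
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # increment with variance dt
        ito_sum += w * dw                   # integrand evaluated at the left point
        quad_var += dw * dw
        w += dw
    return w, ito_sum, quad_var


w_t, ito_sum, quad_var = ito_sum_w_dw()
print("left-point sum     :", ito_sum)
print("W_t^2/2 - t/2      :", 0.5 * w_t ** 2 - 0.5 * 1.0)
print("sum of squared dW  :", quad_var)
```

On any single path the left-point sum equals $\tfrac12 W_t^2 - \tfrac12 \sum (\Delta W)^2$ exactly, so its distance to $\tfrac12 W_t^2 - \tfrac12 t$ is governed entirely by how close the sum of squared increments is to $t$.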


The quadratic variation of a semimartingale is a very convenient tool.

DEFINITION B.7 1) The quadratic co-variation of two semimartingales $X$ and $Y$, denoted by $[X,Y]$, is defined by:

$$[X,Y] = XY - X_0 Y_0 - X_- \cdot Y - Y_- \cdot X. \tag{B.3}$$

2) The quadratic variation of a semimartingale $X$ is $[X,X]$ and so:

$$[X,X]_t = X_t^2 - X_0^2 - 2 \int_0^t X_{s-}\, dX_s. \tag{B.4}$$


Ito’s formula for semimartingales 


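In what follows, $D_i f$ denotes the first partial derivative of the function $f$ w.r.t. $x_i$ and $D_{ij} f$ denotes the second partial derivative of the function $f$ w.r.t. $x_i$ and $x_j$.


THEOREM B.1

Let $X = (X^1, \dots, X^d)$ be a $d$-dimensional semimartingale and $f$ a function of class $C^2$ (twice continuously differentiable) on $\mathbb{R}^d$. Then $f(X)$ is a semimartingale and:

$$\begin{aligned}
f(X_t) = f(X_0) &+ \sum_{i \le d} \big(D_i f(X_-) \cdot X^i\big)_t + \tfrac{1}{2} \sum_{i,j \le d} \big(D_{ij} f(X_-) \cdot \langle X^{i,c}, X^{j,c} \rangle\big)_t \\
&+ \sum_{s \le t} \Big[ f(X_s) - f(X_{s-}) - \sum_{i \le d} D_i f(X_{s-})\, \Delta X^i_s \Big].
\end{aligned}$$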
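As a quick check of Theorem B.1 (a worked illustration, not part of the original statement), take $d = 1$, $X = W$ a standard Brownian motion and $f(x) = x^2$, so that $D_1 f(x) = 2x$ and $D_{11} f(x) = 2$. Since $W$ is continuous, the sum over jumps vanishes and $\langle W, W \rangle_t = t$, so the formula reduces to

$$W_t^2 = W_0^2 + 2 \int_0^t W_s\, dW_s + \tfrac{1}{2} \cdot 2 \cdot t
\quad\Longrightarrow\quad
\int_0^t W_s\, dW_s = \tfrac{1}{2}\, W_t^2 - \tfrac{1}{2}\, t,$$

which is the result obtained by direct computation in Example B.1.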


Doléans-Dade exponential formula 


This notion is very useful in financial theory, since most asset price processes are in fact Doléans-Dade exponentials. Moreover, it is also relevant to changes of measure, for example to compute the Radon-Nikodym derivatives of the risk-neutral probabilities.

Consider the equation

$$Y = 1 + Y_- \cdot X \quad (\text{or equivalently } dY = Y_-\, dX \text{ and } Y_0 = 1),$$

where $X$ is a given semimartingale and $Y$ an unknown rcll (right-continuous with left limits) adapted process.

By analogy with the ordinary differential equation $dy/dx = y$, $Y$ is called the exponential of $X$.

THEOREM B.2

If $X$ is a semimartingale then the above equation has one and only one rcll adapted solution (up to indistinguishability) which is a semimartingale, denoted by $\mathcal{E}(X)$, and given by

$$\mathcal{E}(X)_t = \exp\big(X_t - X_0 - \tfrac{1}{2} \langle X^c, X^c \rangle_t\big) \prod_{s \le t} (1 + \Delta X_s)\, e^{-\Delta X_s}. \tag{B.5}$$
Recall some basic properties of this exponential:


PROPOSITION B.1
1) If $X$ has finite variation, then so has $\mathcal{E}(X)$, which is given by

$$\mathcal{E}(X)_t = \exp(X_t - X_0) \prod_{s \le t} (1 + \Delta X_s)\, e^{-\Delta X_s}.$$

2) If $X$ is a local martingale then so is $\mathcal{E}(X)$.
3) Let $T = \inf\{t : \Delta X_t = -1\}$. Then $\mathcal{E}(X) \neq 0$ on $[[0,T[[$, $\mathcal{E}(X)_- \neq 0$ on $[[0,T]]$, and $\mathcal{E}(X) = 0$ on $[[T,\infty[[$.
Note also the following useful property: 


PROPOSITION B.2 
Let $X$ and $Y$ be two semimartingales with $X_0 = Y_0 = 0$. Then

$$\mathcal{E}(X)\, \mathcal{E}(Y) = \mathcal{E}(X + Y + [X,Y]). \tag{B.6}$$
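Relation (B.6) is easy to check by hand in the simplest case. The sketch below assumes that $X$ and $Y$ are pure-jump processes of finite variation, starting at 0 and jumping at the same finitely many times (a jump size of 0 is allowed). In that case Proposition B.1 gives $\mathcal{E}(X)_t = \prod_{s \le t}(1 + \Delta X_s)$ and $[X,Y]_t = \sum_{s \le t} \Delta X_s \Delta Y_s$. The jump sizes and the helper name `stochastic_exponential` are illustrative assumptions.

```python
import math


def stochastic_exponential(jumps):
    """E(X)_t for a pure-jump finite-variation X with X_0 = 0.

    For such a process exp(X_t) * prod((1 + dX) * exp(-dX)) collapses to
    the plain product of (1 + dX) over the jumps up to time t.
    """
    return math.prod(1.0 + j for j in jumps)


# Hypothetical jump sizes of X and Y at three common jump times.
x_jumps = [0.02, -0.01, 0.03]
y_jumps = [-0.015, 0.01, 0.005]

# X + Y + [X, Y] jumps by dx + dy + dx * dy, since [X, Y] jumps by dx * dy.
combined_jumps = [dx + dy + dx * dy for dx, dy in zip(x_jumps, y_jumps)]

lhs = stochastic_exponential(x_jumps) * stochastic_exponential(y_jumps)
rhs = stochastic_exponential(combined_jumps)
print(lhs, rhs)
```

The two numbers agree up to floating-point rounding because $(1 + a)(1 + b) = 1 + a + b + ab$ holds jump by jump; the covariation term in (B.6) is exactly what accounts for the cross product $ab$.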


Example B.2 


As in the Black and Scholes model, consider a rate of return $X$ described by a Brownian motion with drift. Thus, we have:

$$X_t = \mu t + \sigma W_t,$$

where $\mu$ and $\sigma$ are constants, and $(W_t)_t$ is a standard Brownian motion. Then, the stock price process $S$ is the solution of the equation $S = S_0 + S_- \cdot X$. Thus $S = S_0\, \mathcal{E}(X)$, where $\mathcal{E}(X)$ is the Doléans-Dade exponential of $X$ and, since $\langle \sigma W, \sigma W \rangle_t = \sigma^2 t$, one obtains the well-known geometric Brownian motion

$$S_t = S_0 \exp\big((\mu - \tfrac{1}{2} \sigma^2)\, t + \sigma W_t\big).$$

Markov processes and stochastic differential equations


This paragraph is devoted to the Markov property of solutions to stochastic 
differential equations. It contains a brief survey on infinitesimal generators 
for diffusions. 

One of the motivations for the development of stochastic integrals was the study of diffusions (i.e. the continuous strong Markov processes) as solutions of differential equations of the form:

$$X_t = X_0 + \int_0^t f(s, X_s)\, ds + \int_0^t g(s, X_s)\, dW_s,$$

where $(W_t)_t$ is a standard Brownian motion, and $f$ and $g$ are sufficiently regular functions to define a unique continuous solution which is strong Markov. From the general concept of semimartingale differentials, the terms $ds$ and $dW_s$ can be replaced by general semimartingales, which should have independent increments so that the solution is Markovian. As mentioned in Protter [419], a “naive” definition of a Markovian process looks like a weakening of the property of independent increments:


DEFINITION B.8 An adapted process $X$ with values in $\mathbb{R}^d$ is said to be a simple Markov process with respect to the filtration $(\mathcal{F}_t)_t$ if, for each $s \ge 0$, the $\sigma$-fields $\mathcal{F}_s$ and $\sigma(X_t,\ t \ge s)$ (the information from time $s$ onwards) are conditionally independent given $X_s$.


This is in fact equivalent to: for $t \ge s$ and for every $f$ bounded and Borel measurable,

$$E[f(X_t)\,|\,\mathcal{F}_s] = E[f(X_t)\,|\,\sigma(X_s)]. \tag{B.7}$$


“The best prediction of the future given the past and the present is the 
present.” 

From the previous relation, one can define a transition function for a Markov process as follows: for $s < t$ and $f$ bounded and Borel measurable,

$$P_{s,t}(X_s, f) = E[f(X_t)\,|\,\mathcal{F}_s]. \tag{B.8}$$

Letting $f(x) = \mathbf{1}_C(x)$, the preceding relation reduces to

$$P(X_t \in C\,|\,\mathcal{F}_s) = P_{s,t}(X_s, \mathbf{1}_C), \tag{B.9}$$

also denoted by $P_{s,t}(X_s, C)$.

If for any $s < t$, the transition function satisfies the relationship

$$P_{s,t} = P_{0,t-s}, \text{ also denoted by } P(t - s), \tag{B.10}$$

the Markov process is said to be time homogeneous and the transition functions form a semigroup of operators, known as the transition semigroup $(P(t))_t$. In the time homogeneous case, the Markov property becomes, for any $u \ge 0$:

$$P(X_{t+u} \in C\,|\,\mathcal{F}_t) = P(u, X_t, C). \tag{B.11}$$


Thus, a function $P(t, x, C)$ defined on $[0,\infty) \times \mathbb{R}^d \times \mathcal{B}(\mathbb{R}^d)$ is a time homogeneous transition function if:
1) For each $(t,x)$, $P(t, x, .)$ is a probability on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$.
2) For each $x$, $P(0, x, .)$ is the Dirac measure $\delta_x$ at $x$ (since $X_0 = x$).
3) For each $C \in \mathcal{B}(\mathbb{R}^d)$, $P(., ., C)$ is real-valued, Borel measurable, and bounded.
4) $P(t, x, C)$ satisfies the following relation, called the Chapman-Kolmogorov property: for each $(t, x, C)$ and $u \ge 0$:

$$P(t + u, x, C) = \int P(u, y, C)\, P(t, x, dy). \tag{B.12}$$
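The Chapman-Kolmogorov property (B.12) can be checked numerically for a concrete kernel. The sketch below assumes one-dimensional Brownian motion, whose transition function has the Gaussian density $p(t, x, y)$ of the law $N(x, t)$, so that (B.12) reads $p(t+u, x, z) = \int p(u, y, z)\, p(t, x, y)\, dy$ at the level of densities. The integral is approximated by a trapezoidal rule on a truncated interval; the function names, the truncation width and the grid size are illustrative assumptions.

```python
import math


def bm_density(t, x, y):
    """Transition density of 1-d Brownian motion: the N(x, t) density at y."""
    return math.exp(-(y - x) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def chapman_kolmogorov_rhs(t, u, x, z, half_width=12.0, n=4000):
    """Trapezoidal approximation of the integral of p(u, y, z) p(t, x, y) dy."""
    lo, hi = x - half_width, x + half_width
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * bm_density(u, y, z) * bm_density(t, x, y)
    return total * h


t, u, x, z = 0.5, 1.5, 0.0, 1.0
print(bm_density(t + u, x, z), chapman_kolmogorov_rhs(t, u, x, z))
```

Both printed values should coincide to many digits, reflecting the fact that the convolution of the $N(0, t)$ and $N(0, u)$ densities is the $N(0, t+u)$ density.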


The probability measure $m$ defined on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ by $m(C) = P[X_0 \in C]$ is called the initial distribution of the process $X$.

A transition function for $X$ and the initial distribution $m$ determine the finite-dimensional distributions of $X$ by

$$P(X_0 \in C_0, X_{t_1} \in C_1, \dots, X_{t_n} \in C_n) = \int_{C_0} \cdots \int_{C_{n-1}} P(t_n - t_{n-1}, y_{n-1}, C_n) \cdots P(t_1, y_0, dy_1)\, m(dy_0). \tag{B.13}$$


It can be required that the Markov Property holds for stopping times. 


DEFINITION B.9 A time homogeneous simple Markov process $X$ is strong Markov if for any stopping time $T$ with $P[T < \infty] = 1$ and $u \ge 0$,

$$P(X_{T+u} \in C\,|\,\mathcal{F}_T) = P(u, X_T, C), \tag{B.14}$$

or equivalently, for any function $f$ bounded and Borel measurable,

$$E[f(X_{T+u})\,|\,\mathcal{F}_T] = P(u, X_T, f). \tag{B.15}$$

In the previous definition, the strong Markov property has been defined only for time homogeneous processes but, if $X_t$ is an $\mathbb{R}^d$-valued simple Markov process, then $Y_t = (X_t, t)$ is an $\mathbb{R}^{d+1}$-valued time homogeneous simple Markov process.

Examples of such Markov processes are the Brownian motion, the Poisson process, and typical solutions of stochastic differential equations.

A first standard example of such stochastic differential equations is given by the Ito diffusion processes. Consider the case of the evolution of a stock price subject to random shocks (resulting, for instance, from a multitude of stock trade orders). If $b(t, x)$ is the trend of the price at the point $x$ and time $t$ (a kind of “velocity”), then to describe the value $X_t$ at time $t$, it may be reasonable (under some assumptions like independence of the shocks) to use an equation like

$$\frac{dX_t}{dt} = b(t, X_t) + \sigma(t, X_t)\, \xi_t,$$

where $(\xi_t)_t$ is a white noise and $\sigma(t, x)$ measures the amplitude of the shocks (the well-known volatility in finance).

The mathematical interpretation of this equation, due to Ito, is

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t, \tag{B.16}$$

where $(W_t)_t$ is a Brownian motion.


This equation can be immediately extended to the case where $X$ is $\mathbb{R}^d$-valued and $(W_t)_t$ is a $p$-dimensional Brownian motion. $b(t, x) \in \mathbb{R}^d$ is usually called the drift, and $\sigma(t, x) \in \mathbb{R}^{d \times p}$ the diffusion coefficient.

In fact, since the works of Ito, the above equation is written with the use of stochastic integrals: if $X_0$ is given, then the solution is

$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s, \tag{B.17}$$

where $(W_t)_t$ is a Brownian motion, and $b(t, x)$, $\sigma(t, x)$ are appropriately smooth to ensure the existence and the uniqueness of solutions.
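The integral form (B.17) suggests a simple way to approximate a path numerically: replace the two integrals over a short step by $b\,\Delta t$ and $\sigma\,\Delta W$. The sketch below implements this Euler-Maruyama scheme in the scalar case ($d = p = 1$). The scheme is a standard discretization that is not discussed in the text, and the function name, the geometric Brownian motion coefficients and the step count are illustrative assumptions.

```python
import math
import random


def euler_maruyama(b, sigma, x0, t_end, n_steps, rng):
    """Approximate a scalar solution of dX = b(t, X) dt + sigma(t, X) dW.

    Uses X_{k+1} = X_k + b(t_k, X_k) dt + sigma(t_k, X_k) dW_k on a uniform
    grid and returns the list of grid values, starting with x0.
    """
    dt = t_end / n_steps
    t, x = 0.0, x0
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x = x + b(t, x) * dt + sigma(t, x) * dw
        t += dt
        path.append(x)
    return path


# Illustrative coefficients (assumed): b(t, x) = mu * x and sigma(t, x) = vol * x.
mu, vol = 0.05, 0.2
path = euler_maruyama(lambda t, x: mu * x, lambda t, x: vol * x,
                      x0=1.0, t_end=1.0, n_steps=1000, rng=random.Random(42))
print(len(path), path[-1])
```

With the diffusion coefficient set to zero the scheme reduces to the explicit Euler method for the ordinary differential equation $dx/dt = b(t, x)$, which gives a convenient sanity check.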

Intuitively speaking, if $(\mathcal{F}_t)_t$ is the underlying filtration of the Brownian motion, then for small $\varepsilon$ and for all $i, j \le d$,

$$E[X^i_{t+\varepsilon} - X^i_t\,|\,\mathcal{F}_t] = b_i(t, X_t)\, \varepsilon + o(\varepsilon),$$

$$E\big[(X^i_{t+\varepsilon} - X^i_t - b_i(t, X_t)\varepsilon)(X^j_{t+\varepsilon} - X^j_t - b_j(t, X_t)\varepsilon)\,\big|\,\mathcal{F}_t\big] = (\sigma\sigma')_{ij}(t, X_t)\, \varepsilon + o(\varepsilon)$$

(notation: $\sigma'$ denotes the transpose of the matrix $\sigma$).

The basic properties of Ito diffusions are:

1) The Markov property.
2) The strong Markov property.
3) The generator $A$ of the process $X$ can be expressed in terms of $b$ and $\sigma$.


Two conditions are usually introduced on the coefficient functions $b$ and $\sigma$ to guarantee the existence and the uniqueness of the solution. For example,


THEOREM B.3
Let $T > 0$, and let $b(.,.) : [0,T] \times \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma(.,.) : [0,T] \times \mathbb{R}^d \to \mathbb{R}^{d \times p}$ be measurable functions satisfying

1)
$$|b(t,x)| + \|\sigma(t,x)\| \le C(1 + |x|), \tag{B.18}$$

for some constant $C$ (where $|.|$ is the Euclidean norm on $\mathbb{R}^d$, and $\|\sigma\|^2 = \sum_{i,j} \sigma_{ij}^2$) and the functions $t \to b(t,x)$, $t \to \sigma(t,x)$ are continuous.

2)
$$|b(t,y) - b(t,x)| + \|\sigma(t,y) - \sigma(t,x)\| \le D\,|y - x|, \tag{B.19}$$

for some constant $D$.

Let $Z$ be a random variable which is independent of the $\sigma$-field $\mathcal{F}^W$ generated by the Brownian motion $W$ and such that $E[|Z|^2] < \infty$. Then the stochastic differential equation

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t,\quad t \le T, \text{ with } X_0 = Z$$

has a unique solution $X$ which is continuous with respect to $t$ and with components $X^i$ that are adapted and satisfy $E[\sup_{t \le T} (X^i_t)^2] < \infty$.

The first assumption is called a linear growth condition. The second is a (global) Lipschitz condition in $x$, uniform in $t$.


Two standard examples of diffusions are described in what follows. 


Example B.3 Diffusions
1) The stochastic exponential $e^{W_t - t/2}$ is a diffusion with $b(t,x) = 0$ and $\sigma(t,x) = x$.

2) The Ornstein-Uhlenbeck process can be defined as follows:

$$dX_t = k(m - X_t)\, dt + \sigma\, dW_t,$$

where $k$ is a non-negative constant and $X_0$ is independent of the Brownian motion $W$. Note that $X$ is a Gaussian process (all finite-dimensional distributions are Gaussian) when $X_0$ is Gaussian or deterministic. This process is often used in term structure modeling since it has the mean-reverting property. In fact, since

$$X_t = m + (X_0 - m)\, e^{-kt} + \sigma \int_0^t e^{-k(t-s)}\, dW_s,$$

then $E[X_t\,|\,X_0] = m + (X_0 - m)\, e^{-kt}$, which tends to $m$ as $t$ goes to infinity (when $k > 0$).
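The explicit solution of the Ornstein-Uhlenbeck equation makes exact simulation possible. The sketch below relies on a standard consequence of that solution, stated here as an assumption since the text does not spell it out: for $k > 0$, given $X_t = x$, the value $X_{t+h}$ is Gaussian with mean $m + (x - m)e^{-kh}$ and variance $\sigma^2 (1 - e^{-2kh}) / (2k)$. The function names and parameter values are illustrative.

```python
import math
import random


def ou_mean(x0, k, m, t):
    """Conditional mean E[X_t | X_0 = x0] = m + (x0 - m) exp(-k t)."""
    return m + (x0 - m) * math.exp(-k * t)


def ou_step(x, k, m, sigma, h, rng):
    """Exact sample of X_{t+h} given X_t = x, assuming k > 0."""
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * k * h)) / (2.0 * k)
    return rng.gauss(ou_mean(x, k, m, h), math.sqrt(var))


# Illustrative parameters (assumed): start above the long-run level m.
x0, k, m, sigma = 0.10, 1.0, 0.05, 0.1
rng = random.Random(7)
for horizon in (0.5, 2.0, 8.0):
    samples = [ou_step(x0, k, m, sigma, horizon, rng) for _ in range(20000)]
    print(horizon, sum(samples) / len(samples), ou_mean(x0, k, m, horizon))
```

As the horizon grows, both the sample average and the theoretical mean approach $m$, which is the mean-reverting behaviour described in the example.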


In fact, the definition of a diffusion is not standardized. For example, a process may be called a diffusion if it has continuous sample paths and if it satisfies the strong Markov property. Under the assumptions of the previous theorem, the unique solution of the stochastic differential equation is a strong Markov process and, by construction, its paths are continuous.


Examine now the generator of a diffusion, solution of the stochastic differential equation

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t,\quad t \le T, \text{ with } X_0 = Z.$$


Then, we deduce : 


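PROPOSITION B.3
If $f$ is a twice continuously differentiable function with compact support, then $f$ is in the domain $D(A)$ of the operator $A$ and (written here for coefficients that do not depend on $t$)

$$Af(x) = \sum_{i \le d} b_i(x)\, \frac{\partial f}{\partial x_i}(x) + \tfrac{1}{2} \sum_{i,j \le d} (\sigma\sigma')_{ij}(x)\, \frac{\partial^2 f}{\partial x_i \partial x_j}(x). \tag{B.20}$$


PROOF It is based on Ito's formula applied to $Y = f(X)$:

$$dY_t = \sum_{i \le d} \frac{\partial f}{\partial x_i}(X_t)\, dX^i_t + \tfrac{1}{2} \sum_{i,j \le d} \frac{\partial^2 f}{\partial x_i \partial x_j}(X_t)\, d\langle X^i, X^j \rangle_t,$$

from which, denoting by $E^x[.]$ the conditional expectation with respect to the event $X_0 = x$, it can be deduced that

$$E^x[f(X_t)] = f(x) + E^x\Big[\int_0^t \Big( \sum_{i \le d} b_i(X_s)\, \frac{\partial f}{\partial x_i}(X_s) + \tfrac{1}{2} \sum_{i,j \le d} (\sigma\sigma')_{ij}(X_s)\, \frac{\partial^2 f}{\partial x_i \partial x_j}(X_s) \Big)\, ds \Big].$$

Then, the proposition is established by applying the definition of $A$:

$$Af(x) = \lim_{t \to 0} \frac{1}{t} \big\{ E[f(X_t)\,|\,X_0 = x] - f(x) \big\}.$$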




Example B.4 Generators
1) The $d$-dimensional Brownian motion, which of course is the solution of

$$dX_t = dW_t,$$

has a generator $A$ given by, for any function $f$ in $C_0^2(\mathbb{R}^d)$,

$$Af = \tfrac{1}{2} \sum_{i \le d} \frac{\partial^2 f}{\partial x_i^2}.$$

Thus $A = \tfrac{1}{2} \Delta$ where $\Delta$ is the Laplace operator.
2) The Ornstein-Uhlenbeck process

$$dX_t = k(m - X_t)\, dt + \sigma\, dW_t$$

has a generator $A$ given by

$$Af(x) = k(m - x)\, \frac{\partial f}{\partial x}(x) + \tfrac{1}{2} \sigma^2\, \frac{\partial^2 f}{\partial x^2}(x).$$
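The limit defining the generator can be watched converging in the Ornstein-Uhlenbeck case, where the conditional law of $X_t$ is explicit. The sketch below takes $f(x) = x^2$, for which $E[f(X_t)\,|\,X_0 = x]$ is the squared conditional mean plus the conditional variance. Note the assumptions: $f(x) = x^2$ does not have compact support, so it lies outside the class covered by Proposition B.3 and is used only because all quantities are available in closed form; the conditional variance $\sigma^2(1 - e^{-2kt})/(2k)$ (for $k > 0$) is a standard fact not stated in the text; the function names are illustrative.

```python
import math


def ou_second_moment(x, k, m, sigma, t):
    """E[X_t^2 | X_0 = x] for the Ornstein-Uhlenbeck process (k > 0)."""
    mean = m + (x - m) * math.exp(-k * t)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * k * t)) / (2.0 * k)
    return mean ** 2 + var


def generator_on_square(x, k, m, sigma):
    """A f(x) = k (m - x) f'(x) + 0.5 sigma^2 f''(x) for f(x) = x^2."""
    return k * (m - x) * 2.0 * x + 0.5 * sigma ** 2 * 2.0


x, k, m, sigma = 0.08, 1.5, 0.05, 0.2
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = (ou_second_moment(x, k, m, sigma, t) - x ** 2) / t
    print(t, quotient, generator_on_square(x, k, m, sigma))
```

The difference quotient approaches the value given by the generator formula as $t$ decreases, with an error of order $t$.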


Other standard concepts for Markov processes are the following ones (see 
e.g. Oksendal [401]): 


1) The Dynkin formula. 


PROPOSITION B.4
Assume that $f$ is in $C_0^2(\mathbb{R}^d)$. If $T$ is a stopping time with $E[T\,|\,X_0 = x] < \infty$ then

$$E[f(X_T)\,|\,X_0 = x] = f(x) + E\Big[\int_0^T Af(X_s)\, ds\,\Big|\,X_0 = x\Big]. \tag{B.21}$$

For example, if $T$ is the first exit time of a bounded set, then $E[T\,|\,X_0 = x] < \infty$ and the above formula is valid.

Consider for example the $d$-dimensional Brownian motion $W$ starting at $x_0 \in \mathbb{R}^d$. Assume that $|x_0| < R$. Then, by applying the Dynkin formula, the expected value of the first exit time $T_R$ of the Brownian motion from the ball

$$\{x \in \mathbb{R}^d;\ |x| < R\}$$

is given by

$$E[T_R\,|\,W_0 = x_0] = \frac{1}{d}\,\big(R^2 - |x_0|^2\big).$$
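The exit-time formula can be compared with a crude Monte Carlo estimate. The sketch below simulates Brownian paths on a time grid and records the first grid time at which the path is outside the ball. This is an illustrative approximation under stated assumptions: monitoring the path only at grid times misses some exits, so the estimate is biased slightly upwards, and the step size, number of paths and function name are arbitrary choices.

```python
import math
import random


def mean_exit_time(d, radius, x0, dt=2e-3, n_paths=500, seed=0):
    """Monte Carlo estimate of the mean exit time of d-dimensional Brownian
    motion, started at x0, from the open ball of the given radius."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)  # standard deviation of each coordinate increment
    total = 0.0
    for _ in range(n_paths):
        x = list(x0)
        t = 0.0
        while sum(c * c for c in x) < radius * radius:
            for i in range(d):
                x[i] += rng.gauss(0.0, sd)
            t += dt
        total += t
    return total / n_paths


d, radius, x0 = 2, 1.0, (0.0, 0.0)
estimate = mean_exit_time(d, radius, x0)
exact = (radius ** 2 - sum(c * c for c in x0)) / d
print(estimate, exact)
```

For $d = 2$, $R = 1$ and $x_0 = 0$ the formula gives $1/2$; the simulated value should be close to it, up to sampling noise and the discretization bias mentioned above.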


2) The Kolmogorov backward equation.


Consider an Ito diffusion $X$ in $\mathbb{R}^d$ with a generator $A$. If $f$ is a function in $C_0^2(\mathbb{R}^d)$, then by using the Dynkin formula with $T = t$, it follows that

$$u(t, x) = E[f(X_t)\,|\,X_0 = x]$$

is differentiable with respect to $t$ with a derivative given by

$$\frac{\partial u}{\partial t} = E^x[Af(X_t)].$$


Therefore, we have the following result: 


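PROPOSITION B.5 Solution of the backward equation

1) If $f$ is in $C_0^2(\mathbb{R}^d)$, then $u(t, .)$ is in $D(A)$ for each $t$ and

$$\frac{\partial u}{\partial t} = Au,\quad t > 0,\ x \in \mathbb{R}^d, \qquad u(0, x) = f(x),\quad x \in \mathbb{R}^d. \tag{B.22}$$

2) Let $C^{1,2}(\mathbb{R} \times \mathbb{R}^d)$ be the set of all functions continuously differentiable with respect to $t$ and twice continuously differentiable with respect to $x$. If a function $w(.,.)$ is bounded and in $C^{1,2}(\mathbb{R} \times \mathbb{R}^d)$ and satisfies (B.22), then $w = u$.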


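3) The Feynman-Kac formula.
This is a generalization of the Kolmogorov backward equation.


PROPOSITION B.6
1) If $f$ is in $C_0^2(\mathbb{R}^d)$ and $q$ is continuous on $\mathbb{R}^d$ and bounded below, then $v(t, .)$ defined by

$$v(t, x) = E^x\Big[\exp\Big(-\int_0^t q(X_s)\, ds\Big)\, f(X_t)\Big]$$

is the solution of

$$\frac{\partial v}{\partial t} = Av - qv,\quad t > 0,\ x \in \mathbb{R}^d, \qquad v(0, x) = f(x),\quad x \in \mathbb{R}^d. \tag{B.23}$$

2) Let $C^{1,2}(\mathbb{R} \times \mathbb{R}^d)$ be the set of all functions continuously differentiable with respect to $t$ and twice continuously differentiable with respect to $x$. If a function $w(.,.)$ is in $C^{1,2}(\mathbb{R} \times \mathbb{R}^d)$, is bounded on $K \times \mathbb{R}^d$ for each compact subset $K$ in $\mathbb{R}$, and satisfies (B.23), then $w = v$.
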
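An application of this is the determination of the generator of a killed diffusion. Consider an Ito process $X$ solution of

$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t.$$

Its generator is given by

$$Af = \sum_{i \le d} b_i\, \frac{\partial f}{\partial x_i} + \tfrac{1}{2} \sum_{i,j \le d} (\sigma\sigma')_{ij}\, \frac{\partial^2 f}{\partial x_i \partial x_j}.$$

Consider now the process $\tilde{X}$ which is the process $X$ killed at the random time $T$:

$$\tilde{X}_t = X_t \quad \text{if } t < T,$$

and $\tilde{X}_t$ is undefined if $t \ge T$. Denote by $c(x)$ the killing rate, defined by

$$c(x) = \lim_{t \to 0} \frac{1}{t}\, P[X \text{ is killed in the time interval } (0, t)\,|\,X_0 = x].$$

Then, $\tilde{X}$ is also a strong Markov process and

$$E^x[f(\tilde{X}_t)] = E^x[f(X_t)\, \mathbf{1}_{t < T}] = E^x\Big[f(X_t) \exp\Big(-\int_0^t c(X_s)\, ds\Big)\Big],$$

for all bounded continuous functions $f$ on $\mathbb{R}^d$. Thus, the generator $\tilde{A}$ of $\tilde{X}$ is given by

$$\tilde{A}f(x) = \sum_{i \le d} b_i(x)\, \frac{\partial f}{\partial x_i}(x) + \tfrac{1}{2} \sum_{i,j \le d} (\sigma\sigma')_{ij}(x)\, \frac{\partial^2 f}{\partial x_i \partial x_j}(x) - c(x)\, f(x).$$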


As can be seen, the use of generators may lead to the resolution of partial differential equations (PDE), in particular when calculating option prices. Consider the following pricing problem:

Let $A$ be the operator

$$Af(t, x) = \frac{\partial f}{\partial t}(t, x) + b(t, x)\, \frac{\partial f}{\partial x}(t, x) + \tfrac{1}{2} \sigma^2(t, x)\, \frac{\partial^2 f}{\partial x^2}(t, x),$$

where $b$ and $\sigma$ satisfy the assumptions of Theorem B.3.
For a given function $g$ (the payoff of a European option, for example), find the solutions $f$ of the following parabolic equation ($f(t, X_t)$ will be the price of the option at any time $t$ according to the value of the stock $X_t$ at time $t$):

$$Af(t, x) = r f(t, x) \ \text{ for all } t \in [0, T],\ \forall x \in \mathbb{R}^d, \qquad f(T, x) = g(x). \tag{B.24}$$

Consider $X^{t,x}$ the Ito process defined by, for all $u \ge t$,

$$X_u^{t,x} = x + \int_t^u b(s, X_s^{t,x})\, ds + \int_t^u \sigma(s, X_s^{t,x})\, dW_s,$$


with the initial condition $X_t^{t,x} = x$. Then, if $f$ is the solution of (B.24), applying the Ito formula, $f$ satisfies

$$f(u, X_u^{t,x}) = e^{r(u-t)} f(t, x) + \int_t^u e^{r(u-s)}\, \frac{\partial f}{\partial x}(s, X_s^{t,x})\, \sigma(s, X_s^{t,x})\, dW_s.$$

If the previous stochastic integral is a martingale (under a mild integrability assumption on $\sigma\, \partial f / \partial x$), then it can be deduced that

$$f(t, x) = E[e^{-r(T-t)} g(X_T^{t,x})] = E[e^{-r(T-t)} g(X_T)\,|\,X_t = x].$$
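The probabilistic representation above lends itself to Monte Carlo evaluation. The sketch below is an illustration under stated assumptions, not the text's own method: it takes $b(t,x) = r x$ and $\sigma(t,x) = \sigma x$ (geometric Brownian motion with drift $r$), a call payoff $g(x) = (x - K)^+$, and samples $X_T^{t,x}$ exactly from its lognormal law. The estimate is compared with the classical Black-Scholes closed-form price, quoted here as a standard result that the text does not derive. All function names and parameter values are hypothetical.

```python
import math
import random


def bs_call_closed_form(x, strike, r, vol, tau):
    """Classical Black-Scholes call price, with tau = T - t."""
    def norm_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    d1 = (math.log(x / strike) + (r + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    return x * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)


def bs_call_monte_carlo(x, strike, r, vol, tau, n_paths=100000, seed=0):
    """Estimate E[exp(-r tau) (X_T - K)^+ | X_t = x] by exact sampling of X_T."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        x_T = x * math.exp((r - 0.5 * vol ** 2) * tau + vol * math.sqrt(tau) * z)
        total += max(x_T - strike, 0.0)
    return math.exp(-r * tau) * total / n_paths


x, strike, r, vol, tau = 100.0, 100.0, 0.03, 0.2, 1.0
print(bs_call_monte_carlo(x, strike, r, vol, tau),
      bs_call_closed_form(x, strike, r, vol, tau))
```

The two values agree up to Monte Carlo error, which decreases like the inverse square root of the number of simulated paths.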
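Example B.5 Basic models
1) Black and Scholes model - The well-known equation is

$$Af(t, x) = \frac{\partial f}{\partial t} + r x\, \frac{\partial f}{\partial x} + \tfrac{1}{2} \sigma^2 x^2\, \frac{\partial^2 f}{\partial x^2} = r f(t, x), \qquad f(T, x) = g(x) = (x - K)^+. \tag{B.25}$$

So

$$f(t, x) = E[e^{-r(T-t)} (X_T - K)^+\,|\,X_t = x].$$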


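2) Cox-Ingersoll-Ross model - This process is usually used in term structure modeling. Thus, it is important to calculate such expectations as $E[\exp(-\int_t^T r_u\, du)\,|\,\mathcal{F}_t]$ when the spot rate $r$ is given by:

$$dr_t = a(b - r_t)\, dt + \rho \sqrt{r_t}\, dW_t.$$

Using the Markov property, it remains to calculate $E[\exp(-\int_t^T r_u^{t,x}\, du)]$. For this, consider the solution $r^{t,x}$ of

$$dr_s^{t,x} = a(b - r_s^{t,x})\, ds + \rho \sqrt{r_s^{t,x}}\, dW_s, \qquad r_t^{t,x} = x.$$

Introduce:

$$f(t, x) = E\Big[\exp\Big(-\int_t^T r_s^{t,x}\, ds\Big)\Big], \qquad f(T, x) = 1.$$

Then, $f$ is solution of the PDE:

$$\frac{\partial f}{\partial t} + a(b - x)\, \frac{\partial f}{\partial x} + \tfrac{1}{2} \rho^2 x\, \frac{\partial^2 f}{\partial x^2} = x f.$$

Now, results concerning PDE are applied to determine the solution, which is given by

$$f(t, x) = \Phi(T - t) \exp(-x\, U(T - t)),$$

with

$$\Phi(s) = \left( \frac{2\gamma\, e^{(a + \gamma)s/2}}{(\gamma + a)(e^{\gamma s} - 1) + 2\gamma} \right)^{2ab/\rho^2}, \qquad \gamma^2 = a^2 + 2\rho^2, \qquad U(s) = \frac{2(e^{\gamma s} - 1)}{(\gamma + a)(e^{\gamma s} - 1) + 2\gamma}.$$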
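The closed-form Cox-Ingersoll-Ross solution can be checked against its PDE numerically. The sketch below codes $f(t,x) = \Phi(T-t)\exp(-x\,U(T-t))$ using the standard CIR expressions for $\Phi$, $U$ and $\gamma$, and evaluates the residual of $\partial f/\partial t + a(b-x)\,\partial f/\partial x + \tfrac12 \rho^2 x\, \partial^2 f/\partial x^2 - x f$ with central finite differences. The function names, the step size `h` and the parameter values are illustrative assumptions.

```python
import math


def cir_bond_price(t, x, maturity, a, b, rho):
    """f(t, x) = Phi(T - t) exp(-x U(T - t)) for the CIR short-rate model."""
    s = maturity - t
    gamma = math.sqrt(a * a + 2.0 * rho * rho)
    growth = math.exp(gamma * s) - 1.0
    denom = (gamma + a) * growth + 2.0 * gamma
    phi = (2.0 * gamma * math.exp((a + gamma) * s / 2.0) / denom) ** (2.0 * a * b / rho ** 2)
    u = 2.0 * growth / denom
    return phi * math.exp(-x * u)


def pde_residual(t, x, maturity, a, b, rho, h=1e-3):
    """Central finite-difference residual of the CIR pricing PDE at (t, x)."""
    def f(tt, xx):
        return cir_bond_price(tt, xx, maturity, a, b, rho)
    f_t = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    f_x = (f(t, x + h) - f(t, x - h)) / (2.0 * h)
    f_xx = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h ** 2
    return f_t + a * (b - x) * f_x + 0.5 * rho ** 2 * x * f_xx - x * f(t, x)


# Illustrative parameters (assumed): mean reversion a, long-run level b, volatility rho.
a, b, rho, maturity = 0.5, 0.04, 0.1, 5.0
print(cir_bond_price(1.0, 0.03, maturity, a, b, rho))
print(pde_residual(1.0, 0.03, maturity, a, b, rho))
```

The residual should be negligible compared with the individual terms of the equation, and the terminal condition $f(T, x) = 1$ holds because $\Phi(0) = 1$ and $U(0) = 0$.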





Ito diffusions are used in most financial models. Nevertheless, other semimartingales can be introduced instead of the Brownian motion; other Lévy processes for example (see Cont and Tankov [128]). So, it is interesting to study stochastic differential equations driven by more general semimartingales (see Protter [419]).